


The Uncertainty Principle 
and Foundations of 
Quantum Mechanics 

A Fifty Years' Survey 

Edited by 

William C. Price, F.R.S.
Wheatstone Professor of Physics, 
University of London King's College 

Seymour S. Chissick 

Lecturer in Chemistry, 

University of London King's College 

A Wiley-Interscience Publication 


Chichester · New York · Brisbane · Toronto



Copyright © 1977, by John Wiley & Sons, Ltd. 

Reprinted February 1979 

All rights reserved. 

No part of this book may be reproduced by any means, nor 
transmitted, nor translated into a machine language without the 
written consent of the publisher. 

Library of Congress Cataloging in Publication Data: 

Main entry under title: 

The Uncertainty principle and foundations of quantum mechanics. 

'A Wiley-Interscience publication.' 

'A tribute to Professor Werner Heisenberg to commemorate 
the fiftieth anniversary of the formulation of quantum mechanics.' 

1. Quantum theory — Addresses, essays, lectures. 2. Heisenberg 
uncertainty principle — Addresses, essays, lectures, I. Price, William 
Charles, 1909- . II. Chissick, Seymour S. III. Heisenberg, 
Werner, 1901-1976. 
QC174.125.U5 530.17 / 76-18213 

ISBN 471 99414 6 

Set on Linotron Filmsetter and Printed in Great Britain 
by J. W. Arrowsmith Ltd., Bristol 

A tribute to 

Professor Werner Heisenberg to commemorate 
the fiftieth anniversary of the formulation of 
quantum mechanics. 


List of Contributors 

Bohm, David 
Clarke, Christopher J. S. 
Detrich, John H.
Feldman, Gordon 
Gudder, Stanley P. 
Heisenberg, Werner 
Hodgson, Peter E. 
Kraus, K. 

Kuryshkin, Vassili V. 
Lanz, Ludovico 
Ludwig, Guenther 
Mignani, Roberto 
Papp, Erhardt W. R. 
Ratner, Mark A. 
Rayski, Jacek M. 

Rayski, Jerzy 
Recami, Erasmo 
Reece, Gordon 
Roothaan, Clemens C. J. 
Rühl, W.
Rylov, Yuri A. 
Sabin, John R. 
Santhanam, Thalanayar S. 
Slawianowski, Jan J.
Stenger, William 
Streit, Ludwig 
Tassie, Lindsay J. 
Trickey, Samuel B. 
Van Horn, Hugh M. 
Yunn, B. C. 





A remarkable factor in the progress of science is the temporary concentration 
of interest on particular topics. Science is above all a social activity, the picture 
of the lonely scientist being largely a figment of an untutored imagination. 
Scientists hunt in packs and follow scents. Sometimes the scent is material: 
when the progress of technology has opened up a whole new method of 
experimental work, and the ways of using this newly available technique, the 
assimilation of the results obtained, the formulation of novel hypotheses and 
their means of being tested all attract a large pack of experimenters and 
theorists that in full cry produces astonishingly rapidly a large and novel output. 
On other occasions the scent is intellectual, when an awkward question has 
been asked and many try to find at least partial answers to it, answers that can 
often lead to fruitful insights and new and vital problems. 

One characteristic of this pack hunting is that if a new theory leads in rapid 
succession to numerous and varied experimental tests, each passed with 
honours, each leading to yet newer applications, then hardly anybody will stop 
to examine critically and logically the philosophy and internal consistency of 
the theory. Nobody will want to do so, because if he finds no flaw, his work will 
be regarded as insignificant, while if he does find a flaw, his papers will be 
brushed aside with the comment: 'The theory works, so there must be some
fault in his argument. Why waste time to sort it out when there are so many 
more fascinating things to be done?' Thus foundations cannot be coolly 
examined until well after the main part of the pack has passed the site of the 
excavations, many years later. 

This volume brings together many illuminating phases of one of the most 
exciting and successful hunts in history, the formulation of quantum theory. 
Not only was this hunt outstanding in the range and wealth of experimental 
data it covered (including the extension of the applicability of a theory founded 
on the spectroscopy of atoms to nuclear physics), but also in its philosophical 
implications. Some of them appeared early, some were later grossly misunder- 
stood and indeed exaggerated, but many are only now starting to be fully 
explored and begin to come into focus because only now is the site of the 
excavations sufficiently unencumbered to allow for deep investigations into 
problems of foundations. 


Nothing could be more appropriate for such a hunt than to start with 
Heisenberg's own description of the origin of his celebrated uncertainty 
relations. Now that he is — alas — no longer with us, particular value attaches to 
this recent recollection of the most formative phase of modern physics by one 
of its foremost figures. 

It is splendid to observe from the many contributions in the first two parts of 
the volume how lively and active the subjects of foundations and of measure- 
ment theory now are in so many parts of the world. The final two parts deal with 
novel aspects of formal theory and of applications, where again we live in a 
vigorous period of activity. 

I trust that many will find this book thought provoking, enjoyable and indeed 





Bundesminister für Forschung und Technologie, F.D.R.

Great achievements on the part of researchers are often the result of their 
having had the courage to leave familiar ground and to explore genuinely 
unknown fields. The discoverer of the quantum theory and the uncertainty 
principle was required to leave the solid ground of classical physics. One of the 
most significant changes in our comprehension of the universe — a change 
which is reflected in fields far removed from physics — was wrought by the 
departure from the determinacy of physical phenomena and by far deeper- 
reaching relativization of the law of causality. The quantum theory and the 
uncertainty principle are discoveries which have changed the basis of our way 
of thinking. We still cannot foresee their ultimate consequences. 

Werner Heisenberg, whose passing we mourned during the preparation of 
this book, displayed not only the courage to leave the familiar terrain of 
classical physics. He also possessed the spirit to defend that which has been 
established as true in his field of science against nationalism and racism, even in 
the face of the most bitter political oppression. Both during and after the 
Second World War, he was therefore a guarantor of another Germany which 
desired peace and reconciliation among the peoples of the world. 




The first thirty years of the twentieth century saw an explosive development in 
the physical sciences, the like of which it is improbable we shall see again. Many 
of the discoveries were made by comparatively young men and this has 
provided opportunities for the international scientific community to com- 
memorate the fiftieth anniversary of some of the more fundamental discoveries 
during the lifetimes of their discoverers. This book, dedicated to Professor 
Werner Heisenberg, is one in a series of books, each designed as a tribute to 
one of the founders of modern physics. While the book was organized with the 
cooperation of Professor Heisenberg, it is with deep regret that we learned of 
his death on 1st February 1976, at the age of 74, just before going to press.

This book commemorates the formulation by Heisenberg in the Spring of 
1925 of the system of mechanics known as quantum (or matrix) mechanics. The 
subsequent development of quantum mechanics by Heisenberg with Max Born 
and Pascual Jordan provided the basis for modern physics. One of Heisen- 
berg's best known and far reaching contributions to the understanding of 
quantum mechanics was his Uncertainty Principle, which limits the precision of 
measurement of the dynamic variables of a system. 

While Heisenberg's decisive contribution to physics, for which he received 
the Nobel Prize in 1932, was made at the age of 24, he continued to advance 
knowledge over a wide range of subjects: nuclear and sub-nuclear physics, 
S-matrix theory, solid state theory, plasma and thermonuclear physics, unified 
field theory, etc. 

In compiling this volume, the editors have again been fortunate in securing 
the help and cooperation of scientists throughout the world. The aims were 
essentially similar to those of Wave Mechanics, the First Fifty Years (a tribute to 
Professor Louis de Broglie on the fiftieth anniversary of the discovery of the 
wave nature of the electron): to review aspects of the philosophical implications,
past and current thinking and potential future developments in physics
stemming from the fundamental discoveries associated with, in this case, 
Werner Heisenberg. 

The Editors wish to record their thanks to the University of London King's 
College, for the facilities provided and to Professor David Bohm, Dr. R. J. 
Griffiths and Dr M. P. Melrose for reading various sections of the manuscript 
and for making helpful comments. 

February 1976 

William C. Price, F.R.S. 
Seymour S. Chissick 

University of London King's College 




Contents

1 Remarks on the Origin of the Relations of Uncertainty
Werner Heisenberg

2 In Praise of Uncertainty
Gordon Reece

3 On the Meaning of the Time-Energy Uncertainty Relation
Jerzy Rayski and Jacek M. Rayski, Jr.

4 A Time Operator and the Time-Energy Uncertainty Relation
Erasmo Recami

5 Quantum Theory of the Natural Space-Time Units
Erhardt W. R. Papp

6 Uncertain Cosmology
Christopher J. S. Clarke

7 Uncertainty Principle and the Problems of Joint Coordinate-Momentum Probability Density in Quantum Mechanics
Vassili V. Kuryshkin

8 The Problem of Measurement in Quantum Mechanics
Ludovico Lanz

9 The Correspondence Principle and Measurability of Physical Quantities in Quantum Mechanics
Yuri A. Rylov

10 Uncertainty, Correspondence and Quasiclassical Compatibility
Jan J. Slawianowski

11 A Theoretical Description of Single Microsystems
Guenther Ludwig

12 Quantum Mechanics of Bounded Operators
Thalanayar S. Santhanam

13 Four Approaches to Axiomatic Quantum Mechanics
Stanley P. Gudder

14 Intermediate Problems for Eigenvalues in Quantum Theory
William Stenger

15 Position Observables of the Photon
K. Kraus

16 A New Approach and Experimental Outlook on Magnetic Monopoles
Erasmo Recami and Roberto Mignani

17 Problems in Conformally Covariant Quantum Field Theory
W. Rühl and B. C. Yunn

18 The Construction of Quantum Field Theories
Ludwig Streit

19 Classical Electromagnetic and Gravitational Field Theories as Limits of Massive Quantum Theories
Gordon Feldman

20 Relativistic Electromagnetic Interaction Without Quantum Electrodynamics
Clemens C. J. Roothaan and John H. Detrich

21 The Uncertainty Principle and the Structure of White Dwarfs
Hugh M. Van Horn

22 Applications of Model Hamiltonians to the Electron Dynamics of Organic Charge Transfer Salts
Mark A. Ratner, John R. Sabin and Samuel B. Trickey

23 Alpha-Clustering in Nuclei
Peter E. Hodgson

24 Commutation Relations, Hydrodynamics and Inelastic Scattering by Atomic Nuclei
Lindsay J. Tassie

25 Heisenberg's Contribution to Physics
David Bohm

Author Index
Subject Index




Quantum Uncertainty Description 



Remarks on the Origin of the 
Relations of Uncertainty 

The late Professor WERNER HEISENBERG 

Director Emeritus of the Max Planck Institut für Physik und Astrophysik,
Munich, Germany 

The situation of quantum theory in the summer of 1926 can be characterized by
two statements. The mathematical equivalence of matrix mechanics and wave
mechanics had been demonstrated by Schrödinger, and the consistency of the
mathematical scheme could scarcely be doubted; but the physical interpretation
of this formalism was still quite controversial. Schrödinger, following the
original ideas of de Broglie, tried to compare the 'matter waves' with
electromagnetic waves, to consider them as real, measurable waves in
three-dimensional space. Therefore he preferred to discuss those cases where the
configuration space had only three dimensions (one-particle systems), and he
hoped that the 'irrational' features of quantum theory, especially quantum
'jumps', could be completely avoided in wave mechanics. The stationary states
of a system were defined as standing waves; their energy was really the
frequency of the waves. Born, on the other hand, had used the configuration
space of Schrödinger's theory to describe collision processes, and he took the
square of the wave amplitude in configuration space as the probability of
finding a particle. So he emphasized the statistical character of quantum theory
without attempting to describe what 'really happens' in space and time.

Schrödinger's attempt appealed to many physicists who were not willing to
accept the paradoxes of quantum theory; but the discussions with him in July
1926 in Munich and in September in Copenhagen demonstrated very soon
that such a 'continuous' interpretation of wave mechanics could not even
explain Planck's law of heat radiation. Since Schrödinger was not quite
convinced, it seemed to me extremely important to decide beyond any doubt
whether or not quantum 'jumps' were an unavoidable consequence if one
accepted that part of the interpretation of matrix mechanics which already at
that time was not controversial, namely the assumption that the diagonal
element of a matrix represents the time average of the corresponding physical
variable in the stationary state considered. Therefore I discussed a system
consisting of two atoms in resonance. The energy difference between two
specified consecutive stationary states was assumed to be equal in the two
atoms, so that for the same total energy the first atom could be in the upper and
the second in the lower state or vice versa. If the interaction between the two
atoms is very small, one should expect that the energy goes slowly back and
forth between the two atoms. In this case it can easily be decided whether the
energy of one of the atoms goes continuously from the upper to the lower state
and back again or discontinuously by means of sudden quantum jumps. If $E$ is
the energy of this one atom, then the mean square of fluctuations
$\overline{\Delta E^2}$ is quite
different in the two cases [equation (1)]. The calculation does not require more
than the non-controversial assumption of matrix mechanics mentioned above.
The result decided clearly in favour of the quantum jumps and against the
continuous change.


$$\overline{\Delta E^2} = \overline{(E - \overline{E})^2} \qquad (1)$$
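The difference between the two cases can be illustrated numerically. The following is only a sketch under stated assumptions, not a reconstruction of the original calculation: it assumes a toy model in which the energy of the one atom, measured from its lower level, either varies smoothly as `delta * cos^2(t)` or is always found at 0 or `delta` with probabilities `sin^2(t)` and `cos^2(t)`. The function names and the parameter `delta` (the level spacing) are hypothetical.

```python
import math


def fluctuation_continuous(delta, samples=10_000):
    """Time-averaged (E - mean E)^2 when the energy flows smoothly."""
    # Assumed toy model: E(t) = delta * cos^2(t) above the lower level,
    # sampled uniformly over one period (pi) of the exchange.
    energies = [delta * math.cos(math.pi * k / samples) ** 2 for k in range(samples)]
    mean = sum(energies) / samples
    return sum((e - mean) ** 2 for e in energies) / samples


def fluctuation_jumps(delta, samples=10_000):
    """Same average when the atom is only ever found in one of its two states."""
    # The upper state (energy delta) is found with probability cos^2(t),
    # the lower state (energy 0) with probability sin^2(t).
    mean = 0.0
    mean_of_square = 0.0
    for k in range(samples):
        p_upper = math.cos(math.pi * k / samples) ** 2
        mean += p_upper * delta / samples
        mean_of_square += p_upper * delta ** 2 / samples
    return mean_of_square - mean ** 2


if __name__ == "__main__":
    delta = 1.0  # level spacing in arbitrary units
    print(fluctuation_continuous(delta))  # close to delta**2 / 8
    print(fluctuation_jumps(delta))       # close to delta**2 / 4
```

In this toy model the smooth exchange gives a mean square fluctuation of `delta**2 / 8`, while the jump picture gives `delta**2 / 4`. For a two-level atom the operator $(E - \overline{E})^2$ has the single eigenvalue `(delta / 2)**2`, so its diagonal element agrees with the jump value, which is consistent with the conclusion stated in the text.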



The success of this calculation seemed to indicate that the non-controversial
part of the interpretation of quantum mechanics should already determine
uniquely the complete interpretation of the mathematical scheme, and I was
convinced that there was no room left for any new assumptions in the
interpretation. In fact, in the example mentioned above the square of the
elements of that matrix, which transformed from the state where the total
energy of the system was diagonal to the state where the energy of the one atom
was diagonal, had to be considered as the corresponding probability. In the
autumn of 1926 Dirac and Jordan formulated the theory of those general linear
transformations which corresponded to the canonical transformations of
classical mechanics and which nowadays are called the unitary transformations in
Hilbert space. These authors correctly interpreted the square of the elements
of the transformation matrix as the corresponding probability; this was in line
with Born's older assumptions concerning the square of Schrödinger's wave
function in configuration space and with the example of the resonating atoms.
It was in fact the only assumption which was compatible with the old
non-controversial part of the interpretation of quantum mechanics; so it seemed
that the correct interpretation of the mathematical theory had finally been
found.
But was it really an interpretation, was the mathematical scheme a theory of 
the phenomena? In physics we observe phenomena in space and time; the 
theory should enable us, starting from the present observation, to predict the 
further development of the phenomenon concerned. But at this point the real 
difficulties started. We observe phenomena in space and time, not in configuration
space or in Hilbert space. How can we translate the result of an observation
into the mathematical scheme? For example, we observe an electron in a cloud chamber
moving in a certain direction with a certain velocity; how should this fact be 
expressed in the mathematical language of quantum mechanics? The answer to 
this question was not known at the end of 1926. 

For some time Schrödinger had discussed the possibility that a wave packet
obeying his wave equation could represent an electron. But as a rule a wave
packet spreads out, so that after some time it may be extended over a volume
much bigger than that of the electron. In nature, however, an electron remains
an electron; so this interpretation would not do. Schrödinger pointed out that
in one special case, the harmonic oscillator, the wave packet did not spread; but
this property had to do with the special fact that for the harmonic oscillator the
frequency does not depend on the amplitude.
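How quickly a packet spreads can be estimated with the standard textbook result for a free Gaussian wave packet, whose position spread grows as sigma(t) = sigma0 * sqrt(1 + (hbar * t / (2 * m * sigma0^2))^2). The sketch below assumes that formula and an initial spread of about one atomic diameter; both are illustrative choices, not taken from the text, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def packet_width(sigma0, t, mass=ELECTRON_MASS):
    """Position spread of a free Gaussian wave packet after time t (SI units)."""
    spreading = HBAR * t / (2.0 * mass * sigma0 ** 2)
    return sigma0 * math.sqrt(1.0 + spreading ** 2)


if __name__ == "__main__":
    sigma0 = 1e-10  # assumed initial spread: about one atomic diameter
    for t in (0.0, 1e-15, 1e-12, 1e-9):
        print(f"t = {t:.0e} s  ->  width = {packet_width(sigma0, t):.3e} m")
```

With these assumed numbers the packet grows from atomic size to a fraction of a millimetre within a nanosecond, which illustrates why a free packet could not stand for a localized electron.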

On the other hand there could be no doubt that de Broglie's and
Schrödinger's picture of the three-dimensional matter waves did contain some
truth. In the many discussions we had in Copenhagen during the months after
Schrödinger's visit it was primarily Bohr who emphasized this point again and
again. But what does this term 'some truth' mean? We had already too many 
statements which contained 'some truth'. We could, for example, compare the 
statements: 'The electron moves in an orbit around the nucleus.' 'The electron 
moves on a visible path through the cloud chamber.' 'The electron source emits 
a matter wave which can produce interferences in crystals like a light wave.' 
Each of these statements seemed to be partly true and partly not true, and 
certainly they did not fit together. We got the definite impression that the 
language we used for the description of the phenomena was not quite adequate. 
At the same time we saw that at least in some experiments such concepts as 
position or velocity of the electron, wavelength, energy had a precise meaning, 
their counterpart in nature could be measured very accurately. It turned out 
that for a well defined experimental situation we finally always arrived at the 
same prediction, though Bohr preferred to play between the particle- and 
wave-picture while I tried to use the mathematical scheme and its probabilistic 
interpretation. Still we were not able to get complete clarity; but we understood 
that the 'well defined experimental situation' somehow played an important 
role in the prediction. 

At the beginning of 1927 I was alone in Copenhagen for some weeks; Bohr
had gone to Norway for a skiing holiday. During this time I concentrated all my
efforts on the question: How can the path of an electron in a cloud chamber be 
represented in the mathematical scheme of quantum mechanics? In the despair 
about the futility of my attempts I remembered a discussion with Einstein and 
his remark: 'it is the theory which decides what can be observed'. Therefore I 
tried to turn around the question. Is it perhaps true that only such situations 
occur in nature or in the experiments which can be represented in the 
mathematical scheme of quantum mechanics? That meant: there was not a real 
path of the electron in the cloud chamber. There was a sequence of water 
droplets. Each droplet determined inaccurately the position of the electron, 
and the velocity could be deduced inaccurately from the sequence of droplets. 
Such a situation could actually be represented in the mathematical scheme; the 
calculation gave a lower limit for the product of the inaccuracies of position and 

It remained to be demonstrated that the result of any well defined observation
would obey this relation of uncertainty. Many experiments were discussed,
and Bohr again used successfully the two pictures, wave- and particle-picture,
in the analysis. The results confirmed the validity of the relations of
uncertainty; but in some way this outcome could be considered as trivial,
because if the process of observation itself is subject to the laws of quantum
theory, it must be possible to represent its result in the mathematical scheme of
this theory. But these discussions demonstrated at least that the way in which
quantum theory was used in the analysis of the observations was completely
compatible with the mathematical scheme.
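As a rough numerical illustration of the cloud-chamber argument, the relation can be written in its modern form; the factor 1/2 is the later, sharpened bound and not necessarily the form used in 1927. The droplet size of about one micrometre is an assumption made here only for illustration.

```latex
\Delta x \,\Delta p \ge \frac{\hbar}{2}
\quad\Longrightarrow\quad
\Delta v \ge \frac{\hbar}{2 m_e \,\Delta x}
\approx \frac{1.05\times10^{-34}\ \mathrm{J\,s}}
             {2 \times 9.11\times10^{-31}\ \mathrm{kg} \times 10^{-6}\ \mathrm{m}}
\approx 58\ \mathrm{m\,s^{-1}}
```

Under these assumptions the unavoidable velocity uncertainty is tiny compared with the speed of an electron energetic enough to leave a visible track, which is why a sequence of droplets can look like a classical path even though each droplet fixes the position only inaccurately.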

The main point in this new interpretation of quantum theory was the
limitation in the applicability of the classical concepts. This limitation is in fact
general and well defined; it applies to concepts of the particle picture, like
position, velocity, energy, as well as to concepts of the wave picture like
amplitude, wavelength, density. In this connection it was very satisfactory that
somewhat later Jordan, Klein and Wigner were able to show that Schrödinger's
three-dimensional wave picture could also be subject to the process of
quantization and was then, and only then, mathematically equivalent to quantum
mechanics. The flexibility of the mathematical scheme illustrated Bohr's
concept of complementarity. By this term 'complementarity' Bohr intended to
characterize the fact that the same phenomenon can sometimes be described by
very different, possibly even contradictory pictures, which are complementary
in the sense that both pictures are necessary if the 'quantum' character of the
phenomenon is to be made visible. The contradictions disappear when the
limitations in the concepts are taken properly into account. So we spoke about
the complementarity between wave picture and particle picture, or between
the concepts of position and velocity. In later literature, there have been
attempts to give a very precise meaning to this concept of complementarity.
But it is at least not in the spirit of our discussions in the Copenhagen of 1927 if
the unavoidable lack of precision in our language is to be described with
extreme precision.

There have been other attempts to replace the traditional language of 
physics with its classical concepts for the description of the phenomena, by a 
new language which should be better adapted to the mathematical formalism of 
quantum theory. But the development of language is a historical process, and 
artificial languages like Esperanto have never been very successful. Actually, 
during the past 50 years, physicists have preferred to use the traditional 
language in describing their experiments with the precaution that the limita- 
tions given by the relations of uncertainty should always be kept in mind. A 
more precise language has not been developed, and it is in fact not needed, 
since there seems to be general agreement about the conclusions and predic- 
tions drawn from any given experiment in this field. 

In Praise of Uncertainty

GORDON REECE

Imperial College, London


The first post-natal experiences of a human being are necessarily associated 
with learning about the world in which he or she lives. Ideally, his emotional 
needs will be satisfied in much the same way as his physical requirements. 
Indeed, these various aspects are inextricably intertwined, centring on the 
mother's breast, which supplies at once food, warmth, reassurance and comfort.

From the point of view of a very young baby, the idea of contentment cannot 
be separated from his confidence in the consistency and reliability of the world 
as he sees it. For him, happiness means the certainty that his food will arrive 
when he needs it, at the correct temperature and of a reliable composition. 

Later he becomes aware of non-animate objects, some of which fail to 
interact with him (passive objects like floors and walls), while others (like 
mattresses, blankets and rattles) respond when pushed or shaken. Gradually, a 
baby builds up a library of objects in which he can have confidence. Floors can 
safely be crawled on; thin air cannot. Walls can be bumped without apparent 
damage (to the walls) while balls and bottles roll away when pushed. He learns 
to categorize the objects around him. Fine gradations are learned from the 
varying degrees of, for example, softness of floor coverings, and intensities of 
light, noise and warmth. None of these distinctions, however, rivals the 
fundamental importance of simple 'yes/no' questions such as 'Am I hungry?' 
or 'Am I wet?' It is not until a baby is much older — say a year, when his feelings 
about the world will already have begun to gel — that he begins to confuse the 
issue with questions like 'Am I very hungry?' 

The real source of the baby's confidence in the external world is the certainty 
that if something is wrong it will be remedied. Uncertainty ('Where is Mum- 
my?', 'Where am I?' or 'Why am I still hungry?') represents insecurity, a loss of 
confidence in the external world and consequent unhappiness. The baby's 
confidence relies also on a belief in causality: 'If I cry, then Mummy will come', 
'If I get milk, then I shall no longer be hungry', and the action of crying 
represents this reliance. 


Eventually the child acquires the verbal skills to express his feelings, and to 
extend them by asking questions. No-one who has lived under the ceaseless 
questioning of a normal four-year-old child can have failed to detect consistent 
trends in the style of interrogation. Typically, one is asked questions like: 'Why 
can't we live upside down?' . . . followed by 'How long can you stand on your 
head?', 'Why can you stand on your head longer than me? Because you're older 
than me?' Such questioning is aimed at imposing a simple logical structure on 
seemingly haphazard phenomena. The 'simplest' structures, for this purpose, 
are perfect correlations amounting to causal relationships (age correlated with 
ability to stand on one's head). Each set of phenomena is dealt with more or less 
in isolation. Thus it is unusual for a child to follow a set of questions like these 
with, say, the question 'Why does being older make you better at standing on 
your head?' Some causal mechanism is taken for granted, and the exact details 
are not necessarily of interest. Far more likely is the catch question: 'Then why 
can't Granny stand on her head longest? She's older still.' Already, however, 
the child has sufficient confidence in causality that he will tend to dismiss odd 
exceptions to general rules, whether these rules are ones that he has thought up 
for himself or generally accepted truths, such as 'The older you are the wiser 
and taller you are'. It takes a lot of dwarves and imbeciles to convince a child 
that this is not always so. And the realization that Granny, despite her age, can 
no longer stand on her head does not appear to have even the briefest effect on 
the child's confident quest for definite causal connections. 

It is of little or no relevance to what extent the child's desire to impose a 
logical structure on the external world is in some sense innate, and to what 
extent it is a function of his upbringing. The only point of real significance is the 
universality of this desire, and its intensity. If it is a consequence of the direction 
of the child's thinking by the external world, and in particular by the adults in 
that world, this is a remarkable tribute to our ability to mould children in our 
image. There is, however, little evidence for such a view: it seems much more 
likely that we are born with a hefty predisposition towards a belief in causality 
and a desire for certainty. The most compelling evidence for this latter view is 
the fact that we can interpret the behaviour of animals in much the same way as 
we interpret the behaviour of human beings. We do not find any lack of 'logic' 
in the behaviour of chimpanzees, snakes or even amoebae. We do not need a 
special vocabulary to describe the intelligence of animals: indeed it is standard 
practice to use the behaviour of animals to help us understand ourselves. We 
assume that the same analysis as we know to be valid for human behaviour will 
give correct results when applied to other creatures: we are of course imposing 
our own preconceptions on their behaviour. Such methods have so far justified 
themselves by producing self-consistent results. 

As time passes, the child grows up, matures and begins to 'think for himself'.
It is an attribute of intelligent adolescents that they tend to question accepted 
values 'for the sake of it'. Unfortunately they have by this time lost their desire 
to challenge really fundamental 'truths' and their doubting has begun to take 
place within a well-defined framework of accepted authority and standard
techniques. Above all, they have the confidence that all questions have
answers, and that most questions have exact answers. 

It is a mark of true maturity to be able to function in the absence of certainty: 
for example, the ability to be 'good' without the certainty of ultimate retribu- 
tion for one's wickedness. It is much easier to search for minor deviations from 
an accepted truth, and to suggest the appropriate minor modifications, than to 
search freely for correlations and to discover one's own truths. In practice, free search is
also a good deal slower and less efficient: hence the popularity of ready-made
orthodoxies of every kind. 


It is clear that the desire for certainty and the belief in causality are not 
restrictions upon human thought imposed by the peculiar requirements of the 
external world but vice versa. In other words we can answer Eddington's 
disturbing question 'How much do our theories tell us about Nature, and how 
much do we contribute ourselves?' as follows: the very notion of causality and 
the desire for certainty are imposed by us on Nature. To pursue a metaphor due 
to Eddington, we are inclined to trawl the data of physics with a causal net. 
Small wonder, then, that we turn up just what we hope to find. For example, we 
tend still to use the vocabulary and methodology of classical physics when 
dealing with the phenomena of a submicroscopic world. We search for macro- 
scopic analogues, such as Bohr's atom, and set them up with such plausibility 
that they inevitably become obstacles to the further understanding of the very 
phenomena they purport to illuminate. Precisely because they are easy to 
understand in themselves the analogues tend to take on a life of their own. The 
first job of each succeeding generation of physicists is then to demolish the 
simplifications of their predecessors. The lay world, its representatives in the 
scientific establishment, and in times past even the Church, have all naturally 
thrown their weight behind conventional wisdom. The consequent emphasis on 
the destruction of bad old theories rather than the untrammelled construction 
of good new ones has hindered the development of science. In particular it has 
slowed down the process of acceptance of new theories by making them seem 
far more revolutionary than they really are. 

As with so many of the implicit 'values' of science, that of precision can be 
attributed to the Ancient Greeks. Their preoccupation with, for example, the 
problem of 'commensurability' is best explained in terms of a feeling on their 
part that commensurable quantities (rational numbers) were 'good' and irra- 
tionals were 'bad'. The alternative possibility, that they found irrationals too 
difficult to handle, is not particularly plausible, since most of the theorems 
proved in Greek mathematics for rationals hold equally for irrationals. Thus 
Archimedes established the rule for balancing weights on a lever for commen- 
surable ratios of weights, though the proof he gave did not of course require 
such an assumption. When the Pythagoreans proved the existence of irration- 
als, the notion of approximation was drawn into the vocabulary of physics. 

Once established, the Greek attitude to accuracy remained unchallenged for 
2,000 years. During that time, Christian civilization had imposed religious 
standards of 'truth' on science. By associating scientific truth with religious 
dogma, the Church unwittingly gave science a new importance. That impor- 
tance is still with us today: it stems from the need to establish new scientific 
theories beyond reasonable doubt before they could safely be taught. Once 
accepted, though, a theory wore a 'seal of approval', and could not easily be 

One dogma of science in which religion has a more than usually large stake is 
the idea of causality. If we do not need to seek a cause for apparently 
inexplicable events — such as the existence of the Universe — we do not need to 
turn to religion for the explanation. Moreover, once it is admitted that there are 
questions which not only need not but actually cannot be answered — such as 
'Where is that electron and how fast is it going?', 'What is the opposite of 
giraffe?' or 'Why did God create the world?' — it soon becomes evident there 
are whole realms of human experience which will continue to defy a simple, 
precise causal analysis. 

Plato's system of ideals, those 'absolute objects which cannot be seen other 
than by thought,' still underlies our attitude to mathematics and physics. We 
tend to think of real objects as imperfect ideal ones. Though no-one has ever 
found a perfectly smooth plane or an inviscid fluid, the theory of motion of 
solids and liquids treats friction and viscosity as unfortunate aberrations and 
irritations. It is more than 80 years since Rayleigh showed (Rayleigh, 1892) 
that the theory of viscous flow did not in general reduce to inviscid theory as the 
viscosity tended to zero: modern undergraduate mathematics has yet to 
acknowledge Rayleigh's discovery. 

It was not in fact until the late eighteenth and early nineteenth centuries that 
scientists felt free to challenge orthodox theories of precision. The most 
difficult to swallow of all the assertions of the Greeks was the parallel axiom of 
Euclid. In an age of rationalism it seemed only natural to put it to the test. 
Gauss took his instruments and set out to establish the truth of the parallel 
axiom by the only method he knew — that of direct measurement. The incon- 
clusiveness of his results opened up the road to non-Euclidean geometry. The 
notion that the angles of a triangle might add up to about 180 degrees rather 
than exactly 180 degrees was capable of overturning the whole edifice of 
certainty on which physics seemed to be built. The irrational numbers and the 
transcendentals could be treated as exceptional oddities, but a non-Euclidean 
world would make uncertainty pervade every aspect of physics. 

Laplace is commonly credited with having conjured up a demon capable of 
predicting the course of all subsequent events, given complete information on 
every particle in the universe at any one instant. Conventionally, this is 
regarded as the embodiment of rationalist overconfidence. But this can be read 
in precisely the opposite way: even if the world were purely causal it would 
require an impossibly well-informed demon to make proper use of the fact. 
Consequently, the world will necessarily seem unpredictable to us. Because we 
cannot hope to obtain the necessary information, we must resign ourselves to 
an inability to predict the vast majority of phenomena. Thus, although this may 
not be a theoretical limitation, in practice it introduces a new level of uncertainty. 

The laws of thermodynamics, likewise first properly formulated at the 
beginning of the nineteenth century, tell us more about what we cannot know 
or do. In particular, they rule out the possibility of a perfect machine, and hence 
of perpetual motion. Thus another aspect of perfection had to be abandoned. 


Gradually, therefore, imperfection, inaccuracy, unpredictability, uncertainty 
and randomness were accepted into physics. It is reasonable to relate this 
increasing tolerance with the growing maturity of science. By analogy with the 
growth of sophistication in the human being, we see that the history of science is 
the story of the realization that the world is not so simple as we should like it to 
be, that we cannot hope to achieve absolute certainty, and that we cannot hope 
to know or understand everything. Nor is it necessary to 'explain' everything 
that we do not understand, as the manifestation of some supernatural force, 
simply because we do not understand it. Belief in such dogmas is arrogance 
thinly disguised as humility. True humility in science consists in knowing that 
we do not know. 

The culmination of the acceptance of uncertainty came in the decade 
between 1925 and 1935. 1926-7 saw publications by Heisenberg identifying 
the inherent uncertainty associated with certain measurements. In 1931 
Goedel published his Ueber formal unentscheidbare Saetze der Principia 
Mathematica, showing that the axiomatic method itself had inherent limita- 
tions. In 1934 Popper published the Logik der Forschung, which showed that 
the nature of a scientific hypothesis required its falsifiability, and which finally 
demolished the notion of absolute scientific proof (or disproof). It was thus 
doubt and scepticism that distinguished the scientist, and not confidence and 

It is clear that the awareness of the fallibility of the tools they use has made 
scientists much more careful in the way they derive and present their results. 
For example, the interaction between observer and phenomenon is recognized 
as a crucial factor in any sociological investigation. Indeed, in all experimental 
work involving living creatures, the effect of the experiment itself — the pres- 
ence of experimenters and their measuring instruments — on the outcome is 
now recognized. 

Logically, the next step should be to review our approach to the publication 
of the results of scientific investigations. Success is essentially trivial: it is failure 
to detect a satisfactory, simple causal explanation of a phenomenon that 
stimulates speculation. Currently, scientific journals concentrate on the essential 
work of cataloguing success. How much more exciting would be the 
publication of phenomena that defy correlation. It is 'anomalies' (such as that 
of the motion of the planet Mercury) that point the way out of inadequate 
theories and into the excitement of new fields. 

It is easy to point to analogies between the position of quantum mechanics in 
1926, and that of fundamental particle physics in 1976. It may well be that in 
order to resolve their current dilemma physicists may once again have to think 
the unthinkable, and challenge the very foundations of their subject. 


Strutt, J. W. [3rd Baron Rayleigh] (1892) 'On the question of the stability of the flow of fluids', Phil. Mag., 34, 59-70. 

On the Meaning of the Time-Energy 
Uncertainty Relation 


Jagiellonian University, Cracow, Poland 

Soon after the formulation of the usual uncertainty relations between Car- 
tesian coordinates of particles and their momenta 

$$\Delta x \cdot \Delta p_x \approx \Delta y \cdot \Delta p_y \approx \Delta z \cdot \Delta p_z \approx \hbar \tag{1}$$

there appeared the problem of the existence and meaning of a similar relation 
between time and energy 

$$\Delta t \cdot \Delta E \approx \hbar \tag{2}$$

constituting a natural extension of the three relations [given in equation (1)] of 
Heisenberg from the point of view of the special theory of relativity. References 
on the energy-time uncertainty problem may be found, for example, in 
Carruthers and Nieto (1968). 

The relation (2) is not derivable from the formalism of quantum mechanics in 
the same way as relations (1) were derived, neither can the interpretation of 
relation (2) be quite analogous to the ordinary interpretation of (1). 

First of all, in contradistinction to the coordinates x, y, z, the time variable t is 
not an operator associated with an observable characterizing the particle but is 
a universal parameter. Moreover, energy is not a generalized momentum 
canonically conjugate to the time variable t in the usual sense of this word. In 
consequence of the fact that t is not an observable but a parameter, the opinion 
of most physicists is not favourable towards the possibility of regarding the 
operator iħ(∂/∂t) as the operator of energy, and is against the suggestion of 
relating the non-commutability of iħ(∂/∂t) and t with an impossibility of their 
'simultaneous' determination and, consequently, of an appearance of the 
uncertainty relation (2). Incidentally, it is not at all clear what is meant by a 
'simultaneous determination' of t and of any other physical quantity. Any 
measurement of a physical quantity at a given instant of time means a 
simultaneous determination of both this quantity and the time at that instant. 

For the sake of completeness it should be mentioned that time and energy do 
form a pair: a generalized coordinate and its canonically conjugated momentum 
within the framework of the so-called homogeneous canonical formalism, 
but this formalism does not constitute a basis and starting point for quantization. 
The latter is performed with the help of the ordinary canonical formalism 
where the energy, i.e. the Hamiltonian, is not to be regarded as one more of the 
generalized coordinates or momenta. 

It was argued that the Hamiltonian H rather than the operator iħ(∂/∂t) plays 
the role of the operator of energy and that, in order to formulate the fourth 
uncertainty relation, one should look for an operator which would play the 
role of a generalized coordinate, canonically conjugate to the Hamiltonian 
(taken to be a generalized momentum) or, vice versa, for a generalized 
momentum canonically conjugate to the energy (the latter regarded as a 
generalized coordinate). But there are serious difficulties with defining such an 
operator. To show some of them let us limit our considerations to one 
space-like dimension x. The Hamiltonian for a free particle in non-relativistic 
quantum mechanics is H = p²/2m, and the (formal!) operator t satisfying 
the relation 

$$[H, t] = -i\hbar \tag{3}$$

is 

$$t = \frac{m}{2}\left(x\,\frac{1}{p} + \frac{1}{p}\,x\right) + f(x) \tag{4}$$

where f(x) is an arbitrary function of x. (There is a correspondence between the 
operator (4) and the classical time if one puts f(x) = 0 and recalls the fact that in 
classical physics p = mv and for a free particle v = x/t.) However, the trouble is 
that the operator (4) is not well defined because the inverse of the operator p 
does not exist inasmuch as the spectrum of p includes the value zero. Thus, the 
domains of definition of the operators H and t given by (4) are not the same. 
Consequently, the operator H cannot play the role of a generalized coordinate 
correlated with a canonically conjugate momentum and it is impossible to 
derive the fourth uncertainty relation in an analogous way to that which was 
used to prove the usual three relations of Heisenberg. 
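
The purely formal character of relation (3) can be illustrated with a small algebraic check. The sketch below is not part of the original argument: it assumes the momentum representation with x = i d/dp and ħ = 1, sets the arbitrary function in (4) to zero, and uses Laurent polynomials in p as test functions (they are not normalizable, so the check is algebraic only). The helper names and the mass value are hypothetical. The factor i is pulled out, so that t = i · t_real.

```python
from fractions import Fraction

MASS = Fraction(3)  # hypothetical value; the identity holds for any non-zero mass


def add(a, b, sign=1):
    """Return a + sign*b for Laurent polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for n, c in b.items():
        out[n] = out.get(n, 0) + sign * c
    return {n: c for n, c in out.items() if c != 0}


def scale(a, factor):
    """Multiply every coefficient by a non-zero factor."""
    return {n: factor * c for n, c in a.items()}


def times_power(a, k):
    """Multiply the polynomial by p**k (k = -1 is the formal inverse of p)."""
    return {n + k: c for n, c in a.items()}


def deriv(a):
    """d/dp, dropping constant terms whose derivative vanishes."""
    return {n - 1: n * c for n, c in a.items() if n != 0}


def hamiltonian(a):
    """H = p**2 / 2m acting by multiplication."""
    return scale(times_power(a, 2), 1 / (2 * MASS))


def t_real(a):
    """(m/2)(D p^-1 + p^-1 D) with D = d/dp; the formal time operator is i * t_real."""
    return scale(add(deriv(times_power(a, -1)), times_power(deriv(a), -1)), MASS / 2)


def commutator_real(a):
    """[H, t_real] applied to a."""
    return add(hamiltonian(t_real(a)), t_real(hamiltonian(a)), sign=-1)


phi = {0: Fraction(1), 3: Fraction(5), -2: Fraction(-7, 2)}
# [H, t_real] acts as -1, hence [H, t] = i * (-1) = -i, i.e. -i*hbar with hbar = 1.
print(commutator_real(phi) == scale(phi, -1))
```

The check relies on dividing by p at every step, which is exactly the operation that fails when the spectrum of p includes zero; the algebra goes through only because the test functions sidestep that point.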

Recently Eberly and Singh (1973) claimed to have circumvented this diffi- 
culty by constructing a reciprocal time-operator. However, their determination 
of the fourth uncertainty relation has been achieved in a very round-about way 
so that we shall not present it here. In what follows, we will present another, 
very straightforward and direct derivation of this relation. 

Not only a derivation of relation (2) but also its interpretation in a way which 
is closely analogous to the interpretation of the usual relations (1) seems to be 
impossible. In fact, according to quantum mechanics, if the energy spectrum is 
discrete we may construct a stationary solution of the Schrodinger equation 
describing the system in an eigenstate of energy. In this case energy is exactly 
known for any time instant determined with an arbitrarily high precision Δt < ε. 
But also in the case of a continuous energy spectrum it is possible to construct a 
solution so that the energy is determined up to an arbitrarily small uncertainty 
ΔE < ε, and this solution remains almost stationary for a very long time 
interval, i.e. energy is known almost exactly for any instant (determined with an 
arbitrarily high precision) within a long time interval. This contradicts sharply a 
naive interpretation of formula (2) according to which Δt means uncertainty of 
the time instant at which the particle possessed an energy E known within the 
limits of inaccuracy ΔE. 

One could look for an excuse and explanation of the appearance of the 
above-mentioned difficulties with the uncertainty relation (2) in the fact that 
the ordinary quantum mechanics is a non-relativistic theory. Being in disaccord 
with the requirements of relativity, quantum mechanics treats the time 
coordinate on a different footing from the three space-like 
coordinates of the particles. This may constitute a reason why relation (2) does 
not hold true in the ordinary quantum-mechanical description of physical 
phenomena. On the other hand, relation (2) is known to be satisfied in quite 
another context, viz. as a relation between the uncertainty of energy and the 
mean lifetime of unstable particles. But unstable particles are not described 
satisfactorily within the framework of quantum mechanics. They may be 
described consistently only within the framework of quantum field theory 
where the number of particles is an observable that need not be a 
constant of motion. As is well known, it is quantum field theory, not 
quantum mechanics that may be truly reconciled with relativity and so the 
accord of relation (2) with quantum field theory as well as the disaccord with 
quantum mechanics seem to be explicable. 

The above excuse for the appearance of serious difficulties with the problem 
of relation (2) in quantum mechanics is not convincing. The ordinary quantum 
mechanics of a single particle may be regarded as a limiting case of quantum 
field theory in the low-energy region and in the subspace of one-particle states 
(as the number of massive particles becomes constant in the low-energy limit). 
Thus, if relation (2) holds true (in a certain sense) in quantum field theory, it 
should also remain valid in the above-mentioned limit. Indeed, relation (2) 
does not involve the magnitude of the mean value of energy and should be valid 
also in the low-energy limit. 

We may present still another argument against the view that the non- 
relativistic form of the ordinary quantum mechanics is to be blamed for the 
difficulties appearing in connection with the problem of relation (2). It is a 
common feature of relativistic theories that 'fourth' relations (completing some 
three-dimensional relations known from the pre-relativistic physics) are often 
only formally analogous to their three-dimensional counterparts whereas their 
meaning and interpretation are different. Let us illustrate this statement with 
an example: In the relativistic extension of Newtonian mechanics there appears 
a fourth equation of motion of a point particle, formally quite similar to the 
ordinary three equations. But its physical and even mathematical sense is quite 
different: it does not introduce any new degree of freedom, does not increase 
the number of independent equations of motion because it is dependent upon 
the usual three equations of motion and relates the energy change to the work, 
i.e. expresses the law of conservation of energy. In the non-relativistic limit this 
fourth relation does not disappear because energy is conserved also in the 
non-relativistic dynamics. Similarly, if in a relativistic theory the existence of a 
fourth relation (2) is to be expected, this relation should appear also in the 
limiting case of a non-relativistic theory, although its motivation and its 
physical meaning do not need to be similar to those of the usual relations (1). 
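
That the fourth equation of motion carries no independent information can be seen in one line. The following is a sketch in standard notation; the force F = dp/dt, the velocity v, the rest mass m and the speed of light c are symbols assumed for this illustration and do not appear in the text above:

```latex
E^2 = c^2\,\mathbf{p}^2 + m^2 c^4
\;\Longrightarrow\;
E\,\frac{dE}{dt} = c^2\,\mathbf{p}\cdot\frac{d\mathbf{p}}{dt},
\qquad
\mathbf{p} = \frac{E\,\mathbf{v}}{c^2}
\;\Longrightarrow\;
\frac{dE}{dt} = \mathbf{v}\cdot\frac{d\mathbf{p}}{dt} = \mathbf{F}\cdot\mathbf{v}.
```

The energy balance dE/dt = F · v thus follows from the three spatial equations together with the energy-momentum relation, and it keeps the same form in the non-relativistic limit.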

In order to show that the uncertainty relations (1) as well as (2) must hold 
true it is not necessary to invoke the whole machinery of quantum mechanics; 
one may limit oneself to a consideration of de Broglie wave packets. Let us 
consider a function f(x) in R₁ and define the dispersion (uncertainty) of x with 
respect to f(x) in the usual way 

$$\Delta x = \left(f,\,[x - (f, x f)]^2 f\right)^{1/2} \tag{5}$$

where 

$$(f, A f) = \int dx\, f^* A f \tag{6}$$

Let us consider the Fourier transform of the function f(x) 

$$g(k) = \frac{1}{\sqrt{2\pi}} \int dx\, f(x)\, e^{-ikx} \tag{7}$$






A well-known mathematical theorem (see e.g. Heisenberg, 1930) says that the 
minimum of the product of dispersions Δx · Δk is obtained if f(x) is of a 
Gaussian shape 

$$f(x) = (2\pi)^{-1/4} (\Delta x)^{-1/2} \exp\left[-x^2/4(\Delta x)^2\right] \tag{8}$$

Then also g(k) is of a Gaussian form 

$$g(k) = (2\pi)^{-1/4} (\Delta k)^{-1/2} \exp\left[-k^2/4(\Delta k)^2\right] \tag{9}$$

where Δx and Δk appearing in (8) and (9) are identical with the dispersions 
defined according to (5). Moreover, their product is shown to be 

$$\Delta x \cdot \Delta k = \tfrac{1}{2} \tag{10}$$

This is a mathematical fact, quite independent of the meaning of the 
variables x and k. By identifying x with a Cartesian coordinate of a particle in 
the ordinary space and k with the inverse of de Broglie's wavelength divided by 
2π, so that k = 2πλ⁻¹, one gets the usual uncertainty relation between the 
Cartesian coordinate and momentum (expressed in such units that ħ = 1). But 
we may as well replace x by t and k by the frequency ω, which yields 

$$\Delta t \cdot \Delta\omega = \tfrac{1}{2} \quad\text{or}\quad \Delta t \cdot \Delta E = \tfrac{\hbar}{2} \tag{11}$$

Thus, the fourth uncertainty relation (2), or more precisely (11), is a direct 
consequence of the wave aspect of matter. 
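
The statement that a Gaussian packet gives a dispersion product of one half, whatever the variables are called, can be checked numerically. The sketch below is an illustration only: it assumes the Gaussian is parametrized as exp(-x²/4σ²) so that σ is its dispersion in the sense of (5), evaluates the Fourier integral by a plain Riemann sum, and uses grid sizes and function names chosen here for convenience. The same code serves for the pair (t, ω) simply by renaming the variables.

```python
import cmath
import math


def dispersion(grid, amplitude):
    """Dispersion in the sense of (5), with |amplitude|**2 as the weight, on a uniform grid."""
    step = grid[1] - grid[0]
    weights = [abs(a) ** 2 for a in amplitude]
    norm = sum(weights) * step
    mean = sum(u * w for u, w in zip(grid, weights)) * step / norm
    variance = sum((u - mean) ** 2 * w for u, w in zip(grid, weights)) * step / norm
    return math.sqrt(variance)


def fourier_transform(xs, f_values, ks):
    """Riemann-sum approximation of g(k) = (2*pi)**-0.5 * integral of f(x) exp(-i k x) dx."""
    dx = xs[1] - xs[0]
    prefactor = dx / math.sqrt(2.0 * math.pi)
    return [
        prefactor * sum(fx * cmath.exp(-1j * k * x) for x, fx in zip(xs, f_values))
        for k in ks
    ]


def gaussian_product(sigma, n=401):
    """Return the product of the x- and k-dispersions for a Gaussian of dispersion sigma."""
    xs = [-12.0 * sigma + 24.0 * sigma * i / (n - 1) for i in range(n)]
    ks = [(-6.0 + 12.0 * i / (n - 1)) / sigma for i in range(n)]
    f_values = [math.exp(-x * x / (4.0 * sigma ** 2)) for x in xs]
    g_values = fourier_transform(xs, f_values, ks)
    return dispersion(xs, f_values) * dispersion(ks, g_values)


for sigma in (0.25, 1.0, 3.0):
    print(sigma, gaussian_product(sigma))
```

Each printed product should be close to 0.5 regardless of σ: narrowing the packet in x broadens it in k by the same factor.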

While the existence of relation (11) is beyond any doubt the problem of 
its interpretation still remains open. Let us stress once more that the following 
interpretation is incorrect: 'the information about the value E of the energy of a 
particle and the information about the instant t at which it possessed this amount of 
energy are incompatible unless both pieces of information are subject to uncertainties 
ΔE and Δt whose product is not smaller than ħ/2'. In order to find 
out a correct interpretation of the fourth uncertainty relation let us come back 
once more to a discussion of the ordinary uncertainty relations between 
position and momentum. In this case it is also incorrect to say simply that these 
relations mean an impossibility of surpassing the exactitude of information 
about momentum and position of a particle beyond the limits imposed by the 
formulae (1). This last statement is not correct because one can measure the 
position of a particle first (say at t₁) with an arbitrary exactitude and afterwards 
(say at t₂) measure its momentum also with an arbitrarily high precision so that 
in the interval (t₁, t₂) bounded by the two instants of measurements the 
exactitude of our information about position and momentum surpasses, 
indeed, the limits imposed by the relations (1). 

Heisenberg's uncertainty relations, if correctly understood, mean something 
else, namely the following two facts: (a) A simultaneous direct measurement of 
coordinate and momentum of a particle with an exactitude surpassing the limits 
(1) is impossible. (b) If the two measurements were performed consecutively 
then only the result of the latter may be used for probabilistic predictions of the 
future while the result of the former measurement becomes completely disac- 
tualized* and invalidated due to the uncontrollable disturbance of the particle 
by the latter measurement. 

Thus, the point of utmost importance as regards the correct interpretation of 
(1) is that it determines the limits for an accuracy of simultaneous (i.e. at a fixed 
instant t₀) measurements of x and p and, consequently, for a maximal precision 
of prescribing the initial values of the parameters of the system that are 
necessary for the computation of its temporal development. 

Replacing x by t and p by E, we also must not forget to perform suitable 
substitutions in the interpretational comments: exactly as (1) is valid for a fixed 
value t = t₀, the relation (2) must be valid for a fixed value x = x₀, otherwise the 
analogy of the two uncertainty relations (in a two-dimensional space-time) 
would be incomplete and would lead us astray. But what does it mean that the 
relation (2) applies to a fixed point x = x₀? Obviously, it means that if one is 
observing the particle (represented by a wave packet with a given ΔE) pass 
the point x = x₀ during its propagation along the x-axis then one is unable to 
say when it will pass the point x = x₀ with an exactitude greater than Δt = ħ/ΔE. 
The more exact is the knowledge of the particle energy the less exact is the time 
instant of passing (of this particle) by a fixed point on the x-axis and vice versa: 
the more exactly we know the instant at which a particle passed by an arbitrary 
but fixed point (on the x axis) the less exact must be our knowledge about its 
energy. Such is the proper sense of the time-energy uncertainty relation for a free 
particle in a two-dimensional space-time. To our knowledge, such an interpretation 
has not been stated explicitly in any of the extremely numerous scientific 
articles and textbooks on quantum theory. 

* It remains valid for probabilistic retrodictions of the past (see J. Rayski, 1973). 

Going over from two- to a three-dimensional space-time the fixed point 
turns over into a fixed line, and going over to a four-dimensional space-time it 
becomes a surface. In this case the fourth uncertainty relation may be interpreted 
as follows: Δt means the uncertainty of the instant when the particle will 
cross this surface. The product of this uncertainty Δt and the uncertainty of 
energy ΔE cannot be smaller than ħ/2. 

The above-mentioned surface may be a closed surface constituting the 
boundary of a three-dimensional domain whose volume V may be assumed to 
be finite, and we may ask about the instant when a particle will cross this surface 
and enter the domain in question. Again, the knowledge of the instant of 
crossing this boundary by an ingoing particle cannot be made certain beyond 
the exactitude imposed by the uncertainty relation (2). 

The question about ingoing particles which enter a given domain by passing 
from its exterior into its interior across its surface is a problem of boundary 
conditions. In quantum mechanics one usually considers either wave functions 
in the whole space or in a finite domain but with non-penetrable walls. In 
neither case does the problem of how many particles and when, enter or leave 
the domain in question appear. But it is a very natural problem to consider a 
finite domain in space and to ask for a solution of the Schrodinger equation in 
this domain under given initial conditions (say at t = 0) and under some 
boundary conditions determining the ingoing waves, i.e. the waves crossing the 
surface of the domain into its interior (for t > 0). Such mixed boundary-initial 
conditions determine uniquely the solution in this domain for t > 0 and enable 
one to compute the outgoing waves crossing the surface of the domain from its 
interior to its exterior. This is the most natural approach to a description of 
scattering phenomena occurring in a finite domain. 

Now, whereas the initial conditions at t = 0 have to be consistent with the 
ordinary uncertainty relations between coordinates and momenta, the 
boundary conditions for t > 0 must be consistent with the time-energy uncertainty 
relation: the knowledge of when an ingoing particle enters the domain in 
question and the knowledge of its energy are subject to uncertainties satisfying 
the relation (11). 

In conclusion it may be stated that the ordinary uncertainty relations are 
related to the initial value problems at a space-like hypersurface whereas the 
energy-time uncertainty relation is connected with the boundary problems on 
closed time-like hypersurfaces, e.g. abstract (i.e. freely penetrable) walls 
restricting a finite domain during a finite or infinite time interval. 

Hitherto in our discussion we tacitly assumed wave packets describing free 
particles. This fact reminds us of an objection raised by Eberly and coworkers 
(1973) in a footnote to their article. We quote: 'The conventional understanding 
is essentially dichotomous. That is, the uncertainty times associated with 
wave packet spreading and with excited-state decay are regarded as unrelated 
consequences of the uncertainty principle. This point of view is apparent in 
every quantum mechanics text known to the authors.' The question arises 
whether this objection applies also to our understanding of the energy-time 
uncertainty relation. 

The first reason for the appearance of this dichotomy is simply the fact that 
quantum mechanics is in principle unable to describe unstable systems. Therefore 
the relation (2), applied to the lifetimes of unstable systems and the 
uncertainty of their rest masses, can be justified only within quantum field theory 
where the numbers of particles are not constants of motion. But assuming 
quantum field theory we may ask the following question: 

How can we know that an unstable particle has undergone a decay? 
Obviously by surrounding a macroscopic domain D in the interior of which the 
particle is situated by detectors in order to register the decay fragments 
outgoing from the domain through its boundary (equipped with detectors). But 
this is just a particular case of the above described boundary-initial problems: 

At the initial instant t₀ we assume the presence in the domain D of an 
unstable system characterized by an uncertainty of energy ΔE. As for the 
boundary condition we assume that no particles will penetrate into the domain 
from its exterior at t > t₀. We look for outgoing waves of the decay fragments. 
According to our previous discussion the time of crossing the boundary by the 
decay fragments must remain uncertain within Δt ≈ ħ/ΔE. But the uncertainty 
of the instant of escaping from the domain is related to a similar uncertainty Δt 
of the decay instant. Thus, the time of decay counted from an arbitrary initial 
time instant t₀ (when the system was known to be still a bound state) could be 
anything within the interval (t₀, t₀ + Δt). Consequently, the mean lifetime of the 
system is something like one half of Δt and the product Δt · ΔE is, indeed, of 
the order of magnitude of Planck's constant. 
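
To give a feeling for the magnitudes involved, the estimate above can be turned into numbers. The following sketch is an order-of-magnitude illustration only: the function names are hypothetical, the energy width of 1 MeV is an arbitrary example value that does not come from the text, and the factor one half simply mirrors the rough statement made above.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV s


def time_uncertainty(delta_e_ev):
    """Uncertainty of the boundary-crossing instant, taken as hbar / delta_E, in seconds."""
    if delta_e_ev <= 0:
        raise ValueError("energy uncertainty must be positive")
    return HBAR_EV_S / delta_e_ev


def mean_lifetime_estimate(delta_e_ev):
    """Rough mean lifetime, taken as one half of the time uncertainty."""
    return 0.5 * time_uncertainty(delta_e_ev)


width_ev = 1.0e6  # hypothetical energy width of 1 MeV
print(time_uncertainty(width_ev))        # about 6.6e-22 s
print(mean_lifetime_estimate(width_ev))  # about 3.3e-22 s
```

Halving the energy width doubles both figures, so the product of the two uncertainties stays fixed at ħ.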

Let us remark that the boundary condition consisting of an assumption that 
no particles enter the domain for t > t₀ was necessary because otherwise we 
would have to deal with an induced decay which might affect considerably the 
mean lifetime of the unstable system. 

From the above discussion it is obvious that the two uncertainties: one 
connected with the spreading out of wave packets representing free stable 
particles and the other connected with the problem of the lifetime of unstable 
systems are not dichotomous and the objection of Eberly and Singh does not 
apply to our interpretation of the energy-time uncertainty relation. 

Our explanation of the fourth uncertainty relation may be summarized as 
follows: if there existed a well defined 'time-operator' canonically conjugate to 
the Hamiltonian then the fourth uncertainty relation would be independent of 
the usual ones. But it is not the case. Time does not need to be and, in fact, is not 
an operator but a mere parameter. However, just as the fourth equation of 
Newton in relativistic mechanics is not independent of the remaining three 
equations but is a consequence of them, so the fourth uncertainty relation 
exists and is a straightforward consequence of the remaining three relations 
and the wave character of particles. The proper interpretation of the energy-time 
relation is connected with the problem of boundary conditions in quite an 
analogous way to that in which the usual uncertainty relations are connected 
with the problem of initial conditions. In particular, the uncertainty Δt is 
related to the problem of when a particle is penetrating across a given surface. 


Carruthers, P. and Nieto, M. M. (1968) Rev. Mod. Phys., 40, 411. 
Eberly, J. and Singh, L. P. S. (1973) 'Time operators, partial stationarity, and the energy', Phys. Rev. D, 7, 359-362. 
Heisenberg, W. (1930) Die Physikalischen Prinzipien der Quantentheorie, S. Hirzel, Leipzig. 
Rayski, J. (1973) 'The possibility of a more realistic interpretation of quantum mechanics', Foundations of Phys., 3, 89-100. 


A Time Operator and the Time-Energy 
Uncertainty Relation 


University of Catania, Italy 


In nuclear physics and in elementary particle physics (at low energies) it is usual 
to have recourse only to monochromatic plane waves and to the time- 
independent formulation of quantum mechanics. 

With the aim of making quantum mechanics as 'realistic' as possible, let us on 
the contrary adopt a space-time description of the collision phenomena, by 
introducing wave packets. Notice that, even when dealing with many wave 
packets, it is not necessary at all to have recourse to unphysical, multidimen- 
sional spaces. On the contrary, if we want to preserve the individuality of the 
considered packets, we must just supply a temporal (realistic and physical) 
description of them within the ordinary, three-dimensional space. 

As soon as a space-time description of interactions has been accepted, one 
can immediately realize, even in the framework of the usual wave-packet 
formalism, (Olkhovsky and Recami, 1968, 1969) that a quantum operator for 
the observable time is operating. Namely, it is implicitly used for calculating the 
packet time-coordinate, the flight-times, the interaction-durations, the mean-lifetimes 
of metastable states and so on (Recami, 1970; Olkhovsky and 
Recami, 1970; Baldo and Recami, 1969; Olkhovsky, 1967). A preliminary, 
heuristic inspection of the formalism (Olkhovsky and Recami, 1968, 1970) 
suggests the adoption of the following 'operators' (Olkhovsky and Recami, 
1970; Baldo and Recami, 1969) 

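$$\hat{t}_1 \equiv -i\,\frac{\partial}{\partial E}; \qquad \hat{t}_2 \equiv -\frac{i}{2}\,\frac{\overleftrightarrow{\partial}}{\partial E}, \qquad [E \equiv E_{\mathrm{tot}}] \tag{1}$$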


acting on a wave-packet space which we must carefully define [because of the 
differential character of the 'operators' (1)]. 


Let us first consider, for simplicity, a free particle in the one-dimensional case, 
i.e. the packet: 

$$F(t, x) = \int_0^\infty dp \cdot F(E, p) \cdot \exp[i(px - Et)] \tag{2}$$



where ħ = 1 and E = p²/2m₀. The integral runs only over the positive values 
owing to the 'boundary' conditions imposed by the initial (source) and final 
(detector) experimental devices. Notice that, in so doing, we chose as the frame 
of reference that one in which source and detector are at rest: i.e. the laboratory 
reference frame. In particular, notice that we are considering for simplicity the 
case of source and detector at rest one with respect to the other. 

Let us now observe that the packet (average) position is always to be
calculated at a fixed time $t = \bar{t}$; analogously, the packet time-coordinate is
always to be calculated (by suitably averaging over the packet) for a position
$x = \bar{x}$ along a particular packet-propagation-ray. Therefore, in our case we can
fix a particular $x = \bar{x}$, and restrict ourselves to considering, instead of the packets
(2), the functions:

$$F(t, \bar{x}) = \int_0^\infty dp\, f'(p, \bar{x})\, \exp[-iEt] = \int_{(+)} dE\, f(E, \bar{x})\, \exp[-iEt] \qquad (3)$$


where $E \equiv E_{\mathrm{tot}} = E_{\mathrm{kin}} = p^2/2m_0$; and $f' = f \cdot dE/d|p|$. Functions $F(t, \bar{x})$ and
$f(E, \bar{x})$, being only functions either of t or of E, respectively, are neither wave
functions (that satisfy any Schrödinger equation), nor do they represent states
in the chronotopic or four-momentum spaces. Let us briefly set:

$$F \equiv F(t) \equiv F(t, \bar{x}); \qquad f \equiv f(E) \equiv f(E, \bar{x})$$


It is easy to go from functions F, or f, back to the 'physical' wave packets, so that
one gets a one-to-one correspondence between our functions and the 'physical
states'. We shall respectively call 'space t' and 'space E' the functional spaces
of the F's and of the transformed functions f, with the mathematical conditions
that we are going to specify. In those spaces, for example, the norms will be:

$$\|F\| \equiv \int_{-\infty}^{\infty} |F|^2\, dt; \qquad \|f\| \equiv \int_0^\infty |f|^2\, dE$$


In any case, due to equations (3), the space t and the space E are representa- 
tions of the same abstract space P, where we indicate 

$$F \to |F\rangle; \qquad f \to |f\rangle$$

where $|F\rangle \equiv |f\rangle$. For reasons which we shall see later, let us now specify what has
previously been said by assuming that space P is the space of the continuous, 


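differentiable, square-integrable functions f that satisfy the conditions:

$$\int_0^\infty |f|^2\, dE < \infty; \qquad \int_0^\infty \left|\frac{\partial f}{\partial E}\right|^2 dE < \infty; \qquad \int_0^\infty |f|^2 E^2\, dE < \infty \qquad (5)$$

Such a space is dense (von Neumann, 1932) in the Hilbert space of $L^2$ functions
defined over the interval $0 < E < \infty$.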


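Still within the framework of the usual quantum mechanics with wave packets,
let us define in the most natural way:

$$\langle t(\bar{x}) \rangle \equiv \int_{-\infty}^{\infty} t\, \rho(t, \bar{x})\, dt; \qquad \rho \equiv \frac{|F|^2}{\int_{-\infty}^{\infty} |F|^2\, dt}$$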


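Then we can immediately calculate that

$$\langle t(\bar{x}) \rangle = \langle F|t|F\rangle = \frac{1}{N} \int_{-\infty}^{\infty} [F^* t F]\, dt \qquad (8)$$

where N is the normalization factor, and verify that

$$\langle F|t|F\rangle = \langle f|\, {-\frac{i}{2}}\, \frac{\overleftrightarrow{\partial}}{\partial E}\, |f\rangle \qquad (9)$$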




This would suggest adopting as the time 'operator' the bilinear derivation

$$\hat{t} = -\frac{i}{2}\, \frac{\overleftrightarrow{\partial}}{\partial E} \qquad (10a)$$

By easy calculations, one realizes that we can also adopt the (standard) operator

$$\hat{t} = -i\, \frac{\partial}{\partial E} \qquad (10b)$$

even if at the price of imposing on space-P functions the subsidiary condition
$f(0, \bar{x}) = 0$, which is not fully desirable from a physical viewpoint. Since
using the bilinear derivation (10a) as a (bilinear) operator would require a new
formalism (Olkhovsky and Recami, 1970), let us prefer here the time
operator (10b).
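
The role of the subsidiary condition can be illustrated numerically. The following is only a sketch under stated assumptions: it takes (10a) to be $-(i/2)\overleftrightarrow{\partial}/\partial E$ and (10b) to be $-i\,\partial/\partial E$ with $\hbar = 1$, and it uses two hypothetical test amplitudes (not taken from the text), one vanishing at $E = 0$ and one not. The names `mean_time`, `TAU`, `f_zero` and `f_nonzero` are illustrative choices.

```python
import cmath


def trapezoid(samples, step):
    """Composite trapezoidal rule for equally spaced (possibly complex) samples."""
    return step * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def mean_time(f, df, e_max=40.0, n=40001):
    """Return (<t> from the standard operator, <t> from the bilinear form).

    f and df are the energy-space amplitude f(E) and its derivative df/dE.
    All integrals run over 0 <= E <= e_max, with hbar = 1.
    """
    step = e_max / (n - 1)
    grid = [k * step for k in range(n)]
    fs = [f(e) for e in grid]
    dfs = [df(e) for e in grid]
    norm = trapezoid([abs(v) ** 2 for v in fs], step)
    standard = trapezoid(
        [v.conjugate() * (-1j) * dv for v, dv in zip(fs, dfs)], step
    )
    bilinear = trapezoid(
        [-0.5j * (v.conjugate() * dv - dv.conjugate() * v) for v, dv in zip(fs, dfs)],
        step,
    )
    return standard / norm, bilinear / norm


TAU = 3.0  # hypothetical packet time-coordinate (assumption for illustration)
C = complex(-1.0, TAU)


def f_zero(e):
    """Test amplitude with f(0) = 0: f(E) = E exp(-E) exp(i TAU E)."""
    return e * cmath.exp(C * e)


def df_zero(e):
    return (1 + C * e) * cmath.exp(C * e)


def f_nonzero(e):
    """Test amplitude with f(0) != 0: f(E) = exp(-E) exp(i TAU E)."""
    return cmath.exp(C * e)


def df_nonzero(e):
    return C * cmath.exp(C * e)


std_zero, bil_zero = mean_time(f_zero, df_zero)
std_nonzero, bil_nonzero = mean_time(f_nonzero, df_nonzero)
print(std_zero, bil_zero)        # both close to TAU
print(std_nonzero, bil_nonzero)  # standard form picks up an imaginary part
```

For the amplitude that vanishes at the origin the two forms agree and give a real mean time; for the other one the standard operator acquires an imaginary boundary contribution while the bilinear form stays real, which is the reason for the condition $f(0, \bar{x}) = 0$.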


Our operator (10b) has many good properties as listed below. 

(1). Equation (9) shows that, in the space t, it reduces, as is very natural, to
the mere multiplication by t.
(2). Relations such as equation (8) become physically clear when written: 

whence, in accordance with the Ehrenfest principle, it follows that: 

$$\langle t \rangle = t_0 + \bar{x}/\langle v \rangle \qquad (11)$$



(3). When we pass to a new frame of reference, source and detector will no
longer be at rest. However, only the packet properties relative to the
detector (and to the source) will still be essential. This is enough to
secure the Galilean invariance of our operator.
(4). In the impulse representation, one meets the interesting correspon-
dence ($\hbar = 1$):

i 8 nto\ 

L P P J 2 



where the last addendum vanishes when $\hbar \to 0$.
(5). We have seen already that the space of the (continuous, differentiable)
functions satisfying conditions (5) is dense in the Hilbert space of $L^2$
functions defined over the interval $0 < E < \infty$. Firstly, equation (5) is the
condition for square integrability. Secondly, equation (5) requires that
our operator (10) transform Hilbert-space vectors into Hilbert-space
vectors. Thirdly, equation (5) requires that in our space a 'good'
energy-operator can be defined.

It is easy to verify that our operator (10b) is canonically conjugated
(Heisenberg, 1944) to the (total) energy:

$$[\hat{t}, E] = -i\hbar \qquad (13)$$
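
The commutation relation can be checked pointwise on any smooth amplitude. The sketch below is a hedged illustration, assuming that operator (10b) is $-i\,\partial/\partial E$ with $\hbar = 1$ and that the energy acts by multiplication in the E-representation; the test amplitude `sample` and the helper names are hypothetical.

```python
import cmath


def d_dE(g, e, h=1e-5):
    """Central finite-difference derivative of g at energy e."""
    return (g(e + h) - g(e - h)) / (2 * h)


def t_op(g):
    """Standard time operator with hbar = 1: (t g)(E) = -i dg/dE."""
    return lambda e: -1j * d_dE(g, e)


def energy_op(g):
    """Energy operator in the E-representation: multiplication by E."""
    return lambda e: e * g(e)


def commutator_on(g, e):
    """Value of ([t, E] g)(e) = (t E g)(e) - (E t g)(e)."""
    return t_op(energy_op(g))(e) - energy_op(t_op(g))(e)


def sample(e):
    """Hypothetical test amplitude f(E) = E exp(-E) exp(2iE)."""
    return e * cmath.exp(complex(-1.0, 2.0) * e)


# Each ratio should be close to -i, i.e. [t, E] f = -i f with hbar = 1.
ratios = [commutator_on(sample, e) / sample(e) for e in (0.5, 1.0, 2.5)]
print(ratios)
```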


(6). Under conditions (5), one gets that:

$$\int_0^\infty f_1^* (\hat{t} f_2)\, dE = \int_0^\infty (\hat{t} f_1)^* f_2\, dE \qquad (14)$$

i.e. that our time operator is not only Hermitian, but also symmetric,
according to the usual mathematical terminology (Akhieser and Gladsman,
1954).


(7). Having now the time operator (10b) at our disposal, we can immediately
obtain, through the standard procedure (see, for example, Caldirola,
1966), the uncertainty correlation:

$$\Delta E \cdot \Delta t \geq \hbar/2 \qquad (15)$$





In our opinion, equation (15) means that in general the uncertainty $\Delta E$
that one meets when measuring the energy E of a particle is tied to the
duration of the actual measurement interaction by relation (15). For 
example, let us suppose that we are measuring the energy of a particle 
by observing its track in a bubble chamber. If we examine (by means of a 
photograph) a long track segment, we will be able to have good 
'statistics' in counting bubbles, and therefore a good determination of 
the (average) energy of the particle while producing that track; but the 
time instant at which the particle possessed that energy will be known 
with a large uncertainty. Vice versa, if we examine a short track 
segment, then we shall get a good time measure, but at the price of poor 
bubble-statistics (see Figure 1). In this example, the experiment — or 
better the measurement — is the track-segment examination. 

Figure 1 Track of particle in a bubble chamber 
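
As a rough numerical illustration of the time-energy uncertainty correlation, the sketch below evaluates $\Delta E$ and $\Delta t$ for one hypothetical amplitude that vanishes at $E = 0$. It assumes the standard operator $-i\,\partial/\partial E$ with $\hbar = 1$ and computes $\langle t^2 \rangle$ as $\|\hat{t} f\|^2 / \|f\|^2$, which relies on the boundary term at $E = 0$ vanishing; the function names and the choice of amplitude are assumptions made for the example.

```python
import cmath
import math


def trapezoid(samples, step):
    """Composite trapezoidal rule for equally spaced samples."""
    return step * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def spreads(f, df, e_max=40.0, n=40001):
    """Return (delta_E, delta_t) for an amplitude f(E) with f(0) = 0 (hbar = 1)."""
    step = e_max / (n - 1)
    grid = [k * step for k in range(n)]
    fs = [f(e) for e in grid]
    dfs = [df(e) for e in grid]
    norm = trapezoid([abs(v) ** 2 for v in fs], step)
    mean_e = trapezoid([e * abs(v) ** 2 for e, v in zip(grid, fs)], step) / norm
    mean_e2 = trapezoid([e * e * abs(v) ** 2 for e, v in zip(grid, fs)], step) / norm
    mean_t = (
        trapezoid([v.conjugate() * (-1j) * dv for v, dv in zip(fs, dfs)], step) / norm
    ).real
    # <t^2> = ||t f||^2 / ||f||^2; valid here because f(0) = 0 kills the boundary term.
    mean_t2 = trapezoid([abs(dv) ** 2 for dv in dfs], step) / norm
    return math.sqrt(mean_e2 - mean_e ** 2), math.sqrt(mean_t2 - mean_t ** 2)


C = complex(-1.0, 3.0)  # hypothetical decay rate and time offset


def amplitude(e):
    """Hypothetical amplitude f(E) = E exp(-E) exp(3iE)."""
    return e * cmath.exp(C * e)


def d_amplitude(e):
    return (1 + C * e) * cmath.exp(C * e)


delta_e, delta_t = spreads(amplitude, d_amplitude)
print(delta_e, delta_t, delta_e * delta_t)
```

For this particular amplitude the product comes out above 1/2, as the inequality requires; other admissible amplitudes can be tried by swapping in a different pair of functions.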


When passing to the non-free case, things do not essentially change. Let us 
consider, for example, the case of the scattering of a (spin free) particle by a 
central potential V(r). Inside the potential region, we have packets of partial 
l-waves, distorted by the potential (Calogero, 1967). By the introduction of
S2 n functions ^ ,n) ('> V, ^° ut) ('> f), and of the transformed ones B? n \p, r), 
B i (p, r) (Olkhovsky, Recami and Gerasimchuk, 1974), the time durations 
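are still obtained by using operator (10); and one will still write:

$$\langle t \rangle_{\mathrm{in,out}} = \left\langle -\frac{i}{2}\, \frac{\overleftrightarrow{\partial}}{\partial E} \right\rangle_{\mathrm{in,out}}$$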
Analogously, also equation (13) is still valid, and so on. 



In the particular case of metastable states (Olkhovsky and Recami, 1968;
Olkhovsky, 1968; Recami, 1970), let us admit that $V(r) = 0$ for $r > R$, quantity
R being the potential radius (see Figure 2). Let us analyse the process: free
initial flight; unstable state formation; and decay with subsequent free final
flight. Let us calculate the time $\tau_l$ spent by the particle (or better by its l partial
wave) inside a sphere with centre in the potential centre and with radius $r > R$.

Figure 2 The scattering of a particle by a central potential

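When in the presence of a resonant elastic scattering, we have:

$$S_l = \tilde{S}_l\, \frac{E - E_0 - i\Gamma}{E - E_0 + i\Gamma}; \qquad \delta_l = \tilde{\delta}_l - \operatorname{arctg} \frac{\Gamma}{E - E_0}$$

where $\tilde{S}_l$ and $\tilde{\delta}_l$ are smoothly varying functions in the 'resonance' region. In the
narrow resonance approximation, for sufficiently large values of r one obtains:

$$\tau_l \simeq 2r \langle v^{-1} \rangle + \ldots$$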


Analogously, one can calculate the duration of the interaction (Olkhovsky
and Recami, 1968), or of partial interactions (Olkhovsky and Recami,
1969), in a two-wave packet collision (Olkhovsky, Sokolov and Zaychenko,
1969).

In particular, it seems useful to calculate the interaction duration, $\langle \Delta\tau \rangle_{\mathrm{int}}$,
corresponding to the cross-section enhancements: the necessary condition for
the peak to be associated with a true resonance will be that $\langle \Delta\tau \rangle_{\mathrm{int}}$ also has a
maximum at the considered energy.
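
The criterion can be made concrete with a simple model. The sketch below is an assumption-laden illustration, not the author's calculation: it takes a Breit-Wigner-type resonant phase $-\operatorname{arctg}[\Gamma/(E - E_0)]$, uses the familiar delay $2\, d\delta/dE$ (with $\hbar = 1$) as a stand-in for the interaction duration, and checks that the delay peaks at the resonance energy. The values of `E0` and `GAMMA` are hypothetical.

```python
import math


def resonant_phase(energy, e0, gamma):
    """Resonant phase -arctg(gamma / (E - E0)), taken on a continuous branch.

    atan2 differs from arctg by a constant pi below E0, which does not
    affect the energy derivative used for the delay.
    """
    return -math.atan2(gamma, energy - e0)


def delay(energy, e0, gamma, h=1e-6):
    """Time delay 2 d(delta)/dE (hbar = 1), by central finite difference."""
    d_phase = resonant_phase(energy + h, e0, gamma) - resonant_phase(energy - h, e0, gamma)
    return 2.0 * d_phase / (2.0 * h)


E0, GAMMA = 5.0, 0.1  # hypothetical resonance position and width parameter
energies = [4.0 + 0.01 * k for k in range(201)]
delays = [delay(e, E0, GAMMA) for e in energies]
peak_delay = max(delays)
peak_energy = energies[delays.index(peak_delay)]
print(peak_energy, peak_delay)  # maximum sits at E0 with height 2 / GAMMA
```

A cross-section bump whose delay curve shows no such maximum at the same energy would, by the criterion above, not qualify as a true resonance.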


After what we have seen of the good behaviour of our operator (10), we can ask 
ourselves why a time-operator was not introduced in standard quantum 


mechanics, even if quantum mechanics is typically built up by associating an 
operator to every observable. The reason is that operator (10), defined as 
acting on the space P, does not become hypermaximal (von Neumann, 1932), 
because of the fact that P is a space of functions defined only over the interval
$0 < E < \infty$ and not over the whole E-axis. It follows that $\hat{t}$, while being
Hermitian and symmetric, is however not self-adjoint, and does not allow
identity resolution. Essentially because of these reasons, Pauli (1958) objected 
to the use of a time-operator, and this had the effect of practically stopping 
studies on the subject. 

Von Neumann himself, however, had claimed — followed by other authors 
(e.g. Engelman and Fick, 1963, 1964, 1959; Razavy, 1969, 1967; Landau and 
Lifshitz, 1963; Aharonov and Bohm, 1961; Papp, 1971, 1972; Rosenbaum, 
1969) — that considering in quantum mechanics only self-adjoint operators 
could be too restrictive. This is our conviction: In fact, even if operator $\hat{t}$ does
not admit true eigenfunctions, nevertheless we succeeded in calculating the
average values of $\hat{t}$ over our functions (and over the physical 'packets'
corresponding to them). And that is enough for us. That is also the reason why, 
after equations (10), we have often written the bilinear form (10a) instead of 
the standard operator (10b). 

To clarify the problem, we shall quote an explanatory example (von 
Neumann, 1932): Let us consider a particle Q, free to move in a semispace 
bounded by a rigid wall (see Figure 3). We shall then have $0 < x < \infty$. Conse-
quently, the impulse x-component of Q, which reads

$$\hat{p}_x = -i\, \frac{\partial}{\partial x}$$


will be a non-hypermaximal, non-self-adjoint (but only Hermitian, symmetric)
operator, even if it is an observable and has a simple physical meaning.

Figure 3 A particle free to move in a semispace bounded by a rigid wall

The author acknowledges that the core of the present matter was essentially
developed in collaboration with Professor V. S. Olkhovsky, and he is also 


grateful to Professor M. Toller for very useful criticism. His thanks are due to 
Dr. S. S. Chissick, Dr. A. I. Gerasimchuk and Dr. E. Papp for their very kind 


Aharonov, Y. and Bohm, D. (1961) Phys. Rev., 122, 1649.
Akhieser, N. I. and Gladsman, I. M. (1954) Theorie der linearen Operatoren im Hilbert-Raum, Akademie Verlag, Berlin.
Baldo, M. and Recami, E. (1969) Lett. Nuovo Cimento, 2, 643.
Caldirola, A. (1966) 'Istituzioni di Fisica Teorica', Ambrosiana, Milano.
Calogero, F. (1967) Variable Phase Approach to Potential Scattering, Academic Press, New York.
Engelman, F. and Fick, E. (1959) Supplem. Nuovo Cimento, 12, 63.
Engelman, F. and Fick, E. (1963) Z. Phys., 175, 271.
Engelman, F. and Fick, E. (1964) Z. Phys., 178, 551.
Heisenberg, W. (1944) Die Physikalischen Prinzipien der Quantentheorie, 4th ed., Hirzel,
Landau, L. D. and Lifshitz, E. M. (1963) Kvantovaya Mekhanika, Nauka, Moscow.
Olkhovsky, V. S. (1967) Nuovo Cimento, 48 B, 170.
Olkhovsky, V. S. (1968) Ukr. Fis. Zh., 13, 143.
Olkhovsky, V. S. and Recami, E. (1968) Nuovo Cimento, 53 A, 610.
Olkhovsky, V. S. and Recami, E. (1969) Nuovo Cimento, 63 A, 814.
Olkhovsky, V. S. and Recami, E. (1970) Lett. Nuovo Cimento, 4, 1165.
Olkhovsky, V. S., Recami, E. and Gerasimchuk, A. I. (1974) Nuovo Cimento, 22 A, 263.
Olkhovsky, V. S., Sokolov, L. S. and Zaychenko, A. K. (1969) Soviet J. Nucl. Phys., 9, 114.
Papp, E. (1971) Nuovo Cimento, 5 B, 119.
Papp, E. (1972) Nuovo Cimento, 10 B, 69, 471.
Pauli, W. (1958) Handbuch der Physik, Flügge, S. Ed., Vol. 5/1, p. 60, last ed., Springer-Verlag,
Razavy, M. (1967) Am. J. Phys., 35, 955.
Razavy, M. (1969) Nuovo Cimento, 63 B, 271.
Recami, E. (1970) Acc. Naz. Lincei, Rendic. Sc., 49, 77 (Rome).
Rosenbaum, D. M. (1969) J. Math. Phys., 10, 1127.
Von Neumann, J. (1932) Mathematische Grundlagen der Quantenmechanik, Hirzel, Leipzig.

Quantum Theory of the Natural Space-Time Units

ERHARDT W. R. PAPP
Polytechnic Institute of Cluj, Romania


For more than 50 years the quantum-mechanical space-time description
problem has aroused justified interest and has given rise to great power for
insight. Overcoming difficulties, physicists have investigated this subject
initially from certain points of view, and reinvestigated it subsequently with
respect to a relatively more evolved context. The history of the space-time 
quantization represents in fact the most significant and profound aspect of the 
history of quantum theory itself. Throughout the years attempts have been 
made to analyse, though only provisionally, the peculiarities of a common 
quantum-mechanical description of space-time and matter, and space-time 
quantization has come to be regarded as one of the fundamental problems in 
the scientific understanding of nature. 

The conceptual new content of quantum mechanics is expressed by the 
explicit recognition that measurements cannot be objectively performed with 
indefinitely increasing accuracy. In these conditions we have to consider the 
existence of the ultimate (non-zero) accuracy of the space-time measurements. 
This ultimate accuracy principally results from the new role of the measuring 
apparatus as a physical object which is itself constituted from the really existing 
microparticles. Generally, the microparticles have to be considered neither as 
points, nor with a rigorously spatial extension. This assumption is supported by 
a certain structure of the physical microparticle and vice versa. Considering the 
microparticle coincidences as the elementary acts of the space-time measure- 
ments, there results the existence of an intrinsic space-time allowance (March, 
1941). This allowance is able to offer by itself the possibility of defining — now 
in a natural way (Bohm and coworkers, 1970) — the existence of the natural 
space-time units. In this respect the quantum-mechanical space-time measur- 
ing process can be considered as the counting process of the successive 
elementary coincidences. Moreover, the structure of quantum mechanics as a 
proper physical theory, with a well-established form, can be generally deduced 
from the laws of the measurements (Ludwig, 1972). 

It now becomes necessary for the mathematical formalism of quantum 
mechanics to be explicitly in agreement with the existence of the non-zero 



space-time imprecisions. For this purpose a suitably extended quantum- 
mechanical formalism is needed which has to contain the space-time impreci- 
sions as fundamental entities. Such a formalism has also to permit the consis- 
tent definition of the natural space-time units as certain lower bounds of the 
space-time imprecisions. In this sense account has to be taken of the existence 
of certain profoundness levels in the quantum-mechanical description of the 
microparticles: atoms, nuclei and elementary particles. One would then expect 
to obtain the Bohr radius, the Compton wavelength and the electron radius as 
the natural units for atoms, for free elementary particles and for (interacting) 
electrons, respectively. There is also the Planck radius which has to be the 
natural space constant of the gravitational field. 

Such a quantization programme can be materialized by a suitable use of the
binary description formalism. In this sense the complex numbers $(\alpha - i\beta)$, and
not exclusively the real ones, are allowed to describe the results of the
measurements (Kalnay and Toledo, 1967). The result of the measurement is
now expressed, in a relatively more complete manner, by the pair $(\alpha, \beta)$ of
real numbers, and alternatively by the interval $[\alpha - \beta, \alpha + \beta]$ (or $(\alpha - \beta, \alpha + \beta)$)
on the real $\alpha$ axis. This segment is unequivocally defined by the complex
number. In this respect the space imprecision approach proposed by Flint
(1948) is relatively 'incomplete' as he uses not an interval on the real axis, but
only a translated point. The space-time imprecisions now become inner
elements of the theory, as they are defined as the imaginary parts of the binary
(non-Hermitian) operator averages (Papp, 1972a, b; 1973; 1974a, b, c). In
such conditions von Neumann's axiom, which legitimizes the description of the
physical observables only by hypermaximal operators, is in fact not rejected,
but extended (Fick and Engelmann, 1964; Olkhovsky, Recami and
Gerasimchuk, 1974).
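
The correspondence between a binary (complex) result and an interval on the real axis can be written down in a few lines. This is only an illustrative sketch: the class name `BinaryValue` and its methods are hypothetical, and the sign convention (value $\alpha$, imprecision $\beta \geq 0$, complex number $\alpha - i\beta$) is assumed from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryValue:
    """Measurement result alpha - i*beta: value alpha with imprecision beta."""

    alpha: float
    beta: float

    @classmethod
    def from_complex(cls, z: complex) -> "BinaryValue":
        """Read alpha and beta off the complex number alpha - i*beta."""
        return cls(alpha=z.real, beta=-z.imag)

    def to_complex(self) -> complex:
        """Return the complex number alpha - i*beta."""
        return complex(self.alpha, -self.beta)

    def interval(self) -> tuple:
        """Return the segment [alpha - beta, alpha + beta] on the real axis."""
        return (self.alpha - self.beta, self.alpha + self.beta)


result = BinaryValue.from_complex(2.0 - 0.5j)
print(result.interval())    # (1.5, 2.5)
print(result.to_complex())  # (2-0.5j)
```

The pair, the interval and the complex number carry the same information, which is the sense in which the segment is fixed by the complex number.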

A short history of the space-time quantization problem will be presented in 
Section 2. The meaning of the binary space-time description will be analysed,
in terms of the collision-time evaluations, in Section 3. In this way it is proved
that the space-time imprecisions are able to express certain limitations on the 
accuracy of the space-time measurements. Section 4 is devoted to the defini- 
tion of the binary space-time operators and of the corresponding space and 
time imprecisions. There it is proved that the binary description of the 
space-time is mutually connected with the one of the action. In Section 5 the 
uncertainties of the binary space and time operators are evaluated. The 
high-energy approach to the space-time imprecisions will be performed in 
Section 6. The physical meaning of the electron radius, of the Compton 
wavelength, of the Bohr radius and of the Planck radius as natural units are also 

Except in self-evident cases, units will be chosen so that $\hbar = c = 1$.


Some opinions about the existence of the space-time quanta were expressed 
before the development of quantum mechanics (Poincare, 1913; Proca, 1928; 


Kaluza, 1921). During the period of the main development of quantum 
mechanics the idea of an atomistic structure of space-time had been explicitly 
formulated by Thomson (1926), Levi (1926), Pokrowski (1928), Latzin (1927), 
Beck (1929), Schames (1933) and others. In this respect a preferential meaning 
had been attributed to the electron and/or nuclear radius. Attempts were 
proposed to define a theory of the physical constants as a consequence of the 
existence of the space-time quanta and of the upper value of the elementary 
particle rest-mass (Beck, 1929; Schames, 1933). An essential step to take for 
assuming the existence of the ultimate accuracy of the space-time measure- 
ments in agreement with the mathematical formalism of quantum mechanics 
has been stimulated and supported by the Heisenberg (1927) uncertainty 
relations. In these conditions a conceptually more general approach to the 
space-time quantization has been formulated on the basis of the existence of 
the elementary space $(\hbar/m_0 c)$ and time $(\hbar/m_0 c^2)$ uncertainties by Ruark
(1928), Flint and Richardson (1928), Fürth (1929), Wataghin (1930), Landau
and Peierls (1931), Glaser and Sitte (1934) and Flint (1937). It is also the
amplitude of the Zitterbewegung (Schrödinger, 1930) which has been
interpreted as the result of the existence of the individual space imprecision
(Iwanenko, 1931). During this period fundamental problems concerning the
connection between the structure of the elementary particles and the existence
of the space-time quanta (Fürth, 1929; Glaser and Sitte, 1934), the necessity of
the synthesis between gravity and quantum theory (Fock and Iwanenko, 1929;
Wataghin, 1932; Glaser and Sitte, 1934; Flint, 1935, 1937), the necessity of a
deeper connection of electromagnetism and physical space-time
description (Flint, 1935; Moglich and Rompe, 1939) were analysed and discussed.

Further progress in the analysis of the time-energy uncertainty relations is
due to Mandelstam and Tamm (1945), Fock (1962), Fujiwara (1970) and
Olkhovsky and Recami (1970), whereas certain objections concerning the 
meaning of the time-energy uncertainty relations were raised by Aharonov 
and Bohm (1964) and Bunge (1970). The space uncertainties have been 
evaluated for bound states by Remak (1931), and subsequently calculated for 
the interacting particles by Griffith (1974). A discrete space-time method has 
been used to evaluate the space-time uncertainties for the relativistic particles 
(Henning, 1956). Relativistic space uncertainties were calculated for fermions 
by Blokhintsev (1973). The uncertainty relations have been applied to the 
gravitational field by Peres and Rosen (1966) and Wheeler (1957) and to the 
electromagnetic field by Jordan and Fock (1930), Landau and Peierls (1931) 
and others. There is also the uncertainty-time operator which has been 
explicitly proposed for bound states by Eberly and Singh (1973). 

To overcome the divergence difficulties of the present quantum field theory, 
to suitably define the high-energy production processes and also to favour the 
development of the theory for predicting the elementary particle rest-masses,
attempts were proposed to introduce a fundamental length in the quantum
theory by Heisenberg (1936, 1938a, 1938b, 1942), March (1936, 1937a, b, c), 
Ambarzumian and Iwanenko (1930), Markov (1940) and others. Generally 
this fundamental length has to take the value of the particle size. In agreement 


also with the preferential meaning of the weak interactions (Heisenberg, 
1938a) there are Kadyshevsky (1961) and Kim (1973) who have analysed 
explicitly the space constant of the weak interactions. In the papers cited above 
March advocates the necessity of a suitable redefinition of the short-distance 
geometry. In this sense certain contributions were also given by Wheeler 
(1957, 1962), Coish (1959), Takano (1961), Blokhintsev (1960, 1973), de Witt
(1960) and others. To support the existence of the fundamental length,
space-quantization approaches were performed by Snyder (1947) and Hellund 
and Tanaka (1954). In their approaches the space quantization is the result of a 
discrete space eigenvalue problem. Curved space approaches to the space 
quantization were also proposed (Yang, 1947; Flint, 1948). The compatibility 
between Lorentz invariance and the existence of the discrete space-time 
quanta has been analysed by Schild (1948) and Hill (1955). An alternative 
approach to the space quantization has been proposed by Darling (1950) who 
considers the irreducible volume character of events. Concerning the (cellular) 
discrete-space approach we have to mention the contributions given by Das 
(1960) and Peters (1974). It is significant that the present day non-linear, 
non-local, indefinite metric and higher derivatives field theories support in one 
way or another the existence of the fundamental length (see e.g. Vialtzew, 
1965). There is also evidence for considering that the predictions formulated 
earlier by Heisenberg (1938a) concerning high-energy explosions are qualita- 
tively in agreement with the present day multiparticle production processes. 

Meaningful results were obtained by March (1941) in the description of the 
quantum-mechanical space-time measuring process. In this sense we have to 
consider that the quantum-mechanical measuring apparatus is essentially more 
complex and rather distinct from the one for relativity (March, 1937a). In this 
respect a first step in order to conceptually join relativity and quantum 
mechanics is to consider the reference frame as a component of the quantum- 
mechanical measuring apparatus (Wataghin, 1930). Certain difficulties con- 
cerning the co-existence of the fundamental length with the standard Lorentz 
invariance (Pavlopoulos, 1967) can be, at least qualitatively, overcome e.g. 
within the extended Lorentz invariance condition used by Schild (1948) and 
Hill (1955). However, there is evidence to conclude that the general theory of 
relativity is essentially more suitable for describing the extended particles than 
special relativity. These latter aspects were analysed by Motz (1962, 1972), 
Markov (1965, 1966), Penrose and MacCallum (1973), Sivaram and Sinha 
(1974), Lord and coworkers (1974) and others. In this connection the quantiza- 
tion of the gravitational field is of a special interest (see e.g. Wheeler, 1957; 
Treder, 1963; Brill and Gowdy, 1970). We may thus conclude that the 
problems raised by the space-time quantization have in fact not lost interest 
and opportunity since the appearance of quantum mechanics. 

Progress was also obtained in the definition of the space-time operators as in 
the performing of the collision time evaluations (see e.g. Kalnay, 1971; 
Almond, 1973). It has been proved that further developments of the
quantum-mechanical space-time description need the extension of the standard



quantum-mechanical formalism (Fick and Engelmann, 1964; Kalnay and 
Toledo, 1967; Broyles, 1970; Olkhovsky, Recami and Gerasimchuk, 1974 and 
others). There is also an increasing interest in analysing more deeply certain 
aspects of the very quantum-mechanical ('shell-pulsating') free-particle 
description (see e.g. Dirac, 1972). All the facts presented above permit us to 
assume that the quantum-mechanical space-time description — which is far 
from being completely resolved — is a fundamental problem characterizing all 
the steps in the evolution of the quantum theory. 


The time spent by the outgoing (reduced) particle in the interaction region is
given for $l = 0$ by (Smith, 1960)

$$\tau_0(a, p) = \frac{2a}{v} + 2\, \frac{d}{d\omega} \delta_0(p) - \frac{1}{2\omega} \sin 2[pa + \delta_0(p)] \qquad (1)$$

so that the collision time-shift is

$$\tau_0(a, p) - \frac{2a}{v} = 2\, \frac{d}{d\omega} \delta_0(p) - \frac{1}{2\omega} \sin 2[pa + \delta_0(p)] \qquad (2)$$

where a is the interaction radius, $\omega = p^2/2m$ and $\delta_0(p)$ the phase-shift for
$l = 0$. In agreement with the formal scattering theory we shall consider that the
phase-shift does not explicitly depend on the interaction radius. In order to 
eliminate by all means the presence of the oscillating term a supplementary, 
outside of the theory, averaging device with respect to the interaction radius 
has been imposed by Smith (1960), Jauch and Marchand (1967) and Gien 
(1965). This averaging device is not only artificial but also physically meaning- 
less. On the contrary, the presence of the oscillating term has to be maintained 
in order to preserve the macroscopic causality condition (Wigner, 1955) and 
especially the causal positivity of the interaction-time evaluation (Papp, 1972a; 
Baz, 1966; Peres, 1966). In this respect we can already suppose that the 
presence of the oscillating term presents a fundamental theoretical meaning. 
Indeed, the interaction radius is not uniquely defined. More exactly, if a is
the interaction radius, there is the larger value $a' > a$, too. In these conditions
we can cause, at least formally, the last term of the time-shift (2) to oscillate, so
that the punctual collision-time evaluation $2(d/d\omega)\delta_0(p)$ is in fact replaced by
the interval

$$\left[\, 2\, \frac{d}{d\omega} \delta_0(p) - \frac{1}{2\omega},\ \ 2\, \frac{d}{d\omega} \delta_0(p) + \frac{1}{2\omega} \,\right] \qquad (3)$$

The width of this interval is independent of the dynamical peculiarities of the
collision system. We can thus conclude that the actual observable meaning of
the average $2\langle (d/d\omega)\delta_0(p) \rangle$ can be suitably defined only within a certain range,


whose largest value is given by $\langle 1/2\omega \rangle$. In these conditions we have to consider
that the real purpose of the quantum-mechanical description is a double one: to
perform the observable time-shift evaluations and to state theoretically the
existence of an objective degree of accuracy of the time-shift measurement.
The above results express the essential step in the definition of the binary
description formalism. In this sense the binary interval (3) describes the
measurement in which the observable evaluation $2\langle (d/d\omega)\delta_0(p) \rangle$ is obtained
within the imprecision $\langle 1/2\omega \rangle$. The fact that the above imprecision is twice as
large as the one of $\langle 1/4\omega \rangle$ previously calculated (Papp, 1974a) can be
explained by noticing that the present imprecision does not refer to a single binary
time-shift variable, but to the difference of two binary variables. We can also
remark that the binary description formalism is consistent with the starting
conditions concerning the necessity to assure the fulfilment of the macroscopic
causality condition expressed by the positivity requirement of the
interaction-time evaluation. Indeed, the inequality $|\alpha| > \beta$ expresses both the macroscopic
causality condition and the necessary condition for the binary variable $\alpha - i\beta$ to
possess measurable meaning.
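
The way the oscillating term turns a single time-shift value into an interval can be seen by scanning the interaction radius. The sketch below assumes the time-shift has the form $2\, d\delta_0/d\omega - (1/2\omega) \sin 2[pa + \delta_0]$ with $\omega = p^2/2m$ and $\hbar = 1$; the numerical values of the mass, momentum, phase-shift and its derivative are hypothetical and chosen only for illustration.

```python
import math


def time_shift(a, p, m, delta, ddelta_dp):
    """Time-shift 2 d(delta)/d(omega) - sin(2[pa + delta]) / (2 omega), hbar = 1."""
    omega = p * p / (2.0 * m)
    ddelta_domega = ddelta_dp * m / p  # d/d(omega) = (m/p) d/dp
    return 2.0 * ddelta_domega - math.sin(2.0 * (p * a + delta)) / (2.0 * omega)


# Hypothetical inputs: mass, momentum, phase-shift and its momentum derivative.
M, P, DELTA, DDELTA_DP = 1.0, 2.0, 0.3, 0.7
OMEGA = P * P / (2.0 * M)
CENTRE = 2.0 * DDELTA_DP * M / P

# Scan the interaction radius over one full period pi/p of the oscillating term.
radii = [1.0 + (math.pi / P) * k / 2000 for k in range(2001)]
shifts = [time_shift(a, P, M, DELTA, DDELTA_DP) for a in radii]
lower, upper = min(shifts), max(shifts)
print(lower, upper)  # approximately CENTRE - 1/(2 OMEGA) and CENTRE + 1/(2 OMEGA)
```

The scanned values fill the interval whose half-width $1/2\omega$ depends only on the energy, not on the phase-shift, in line with the statement that the width is independent of the dynamics of the collision.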

The above discussions preserve their meaning in the relativistic case, too.
Thus, the outgoing time-shift of the elastically scattered Klein-Gordon particle
is given, in the one-dimensional case, by

$$\Delta\tau(a, p) = 2\, \frac{d}{dp_0} \delta(p) + \frac{p_0}{2p^2} \sin 2[pa + \delta(p)]$$

where $p_0 = \sqrt{p^2 + m_0^2}$ (Gien, 1965). Similarly to the non-relativistic case we are
now able to define the existence of the relativistic binary time-shift description
with the imprecision given by $\langle p_0/2p^2 \rangle$. For the Dirac particle one obtains

$$\Delta\tau(a, p) = 2\, \frac{d}{dp_0} \delta(p) - \frac{m_0}{2p^2} \sin 2[pa + \delta(p)]$$

thus stating the existence of the particular time imprecision $\langle m_0/2p^2 \rangle$. In
agreement with point (/) of the binary description formalism (Papp, 1973) we 
can see that the above obtained time imprecisions are binarily 'equivalent': 

2p 2 2p 2 


up to the threshold velocity $(\sqrt{3}/2)c$.

We have to mention that a situation similar to the above one arises
when comparing the results obtained for the lower bound of the phase-shift 
derivative by Wigner (1955) and by Goebel, Karplus and Ruderman (1955), 
respectively. The results so obtained allow us to conclude that the space-time 
imprecisions are essentially inner elements of the quantum-mechanical 



In the application of the correspondence principle there are cases when the 
resulting operators are not directly hermitian ones. To avoid the introduction 
of the non-hermitian operators, subsequent symmetrization devices were used. 
In line with Section 3 we shall consider that such symmetrization devices are in 
fact outside the proper theory. Consequently, the symmetrized operators 
cannot be principally used without also allowing the existence of the initial 
non-hermitian operators as physically meaningful. In such conditions we have 
to consider the initial non-hermitian operators as the binary operators which 
are able to originate the standard hermitian ones. 

Thus the classical expressions for the projection of the position vector on the 
momentum direction and of the free evolution time corresponding to that 
direction are 



$$r_p = \frac{\mathbf{p} \cdot \mathbf{r}}{p} \quad \text{and} \quad t_p = \frac{m_0}{p}\, r_p$$


respectively, where p = |p|. Applying directly the correspondence principle we 
obtain the pair of the mutually conjugated binary space operators 

''-$• ''-*-©■' ,8> 

and the associated pair of the binary time-operators 

t„ — /nor 

(4), r p =f*-m (ft)' 


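respectively. We then easily obtain, in agreement also with Lippmann (1966),
the hermitian space-time operators as

$$\hat{r}_p^{(h)} = \tfrac{1}{2}(\hat{r}_p + \hat{r}'_p) \quad \text{and} \quad \hat{t}_p^{(h)} = \tfrac{1}{2}(\hat{t}_p + \hat{t}'_p) \qquad (10)$$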

respectively. Averaging the binary operators with respect to the
non-relativistic wave packet

$$\varphi(\mathbf{r}, t) = (2\pi)^{-l/2} \int d\mathbf{p}\, a(\mathbf{p}) \exp i(\mathbf{p} \cdot \mathbf{r} - \omega t) \qquad (11)$$

where l is the spatial dimension, there results

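$$\langle \hat{r}_p \rangle = -\left\langle \frac{\mathbf{p}}{p} \cdot \frac{\partial}{\partial \mathbf{p}} \arg a(\mathbf{p}) \right\rangle + i(l - 1) \left\langle \frac{1}{2p} \right\rangle \qquad (12)$$

$$\langle \hat{t}_p \rangle = -\left\langle \frac{\partial}{\partial \omega} \arg a(\mathbf{p}) \right\rangle + i(l - 2) \left\langle \frac{1}{4\omega} \right\rangle \qquad (13)$$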

thus allowing the definition of the space and time imprecisions as the imaginary
parts of the above averages. In order to define the time operator for $l = 1$, the


boundary condition 

lim pi 





is needed whereas for $l = 2$ and $l = 3$ the wave-packet form-factor has to be
only bounded at the origin (Papp, 1974c). However, for $l = 1$, appreciable
limitations are not implied as the subspace defined by the well-behaved
condition (14) is dense in the whole Hilbert space. The above boundary
conditions maintain their meaning also in the relativistic case.

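There is also a mutual connection between the binary description of the
action and the one of space-time (Papp, 1974a). Indeed, the $\mathbf{r} \cdot \mathbf{p}$-action
average is given by

$$\langle \mathbf{r} \cdot \mathbf{p} \rangle = -\left\langle \mathbf{p} \cdot \frac{\partial}{\partial \mathbf{p}} \arg a(\mathbf{p}) \right\rangle + i\, l\, \frac{\hbar}{2}$$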

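so that the binary description of the action with the imprecision $j(\hbar/2)$ implies the existence of the binary description of the time with the imprecision $j\langle\hbar/4\omega\rangle$, and vice versa. Alternatively, there is implied a binary description of the space shift

$$\left\langle \frac{\mathbf{p}}{p}\cdot\frac{\partial}{\partial\mathbf{p}}\arg a(\mathbf{p}) \right\rangle$$

or of the time shift

$$\left\langle \frac{m_0\,\mathbf{p}}{p^2}\cdot\frac{\partial}{\partial\mathbf{p}}\arg a(\mathbf{p}) \right\rangle$$

with the imprecisions given by $j\langle\hbar/2p\rangle$ and $j\langle\hbar/4\omega\rangle$, respectively. We can thus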
conclude that the purpose of the quantum-mechanical binary space-time 
description is indeed a double one: to perform the observable space-time 
(shifts) evaluations and to define the imprecisions of the space-time (shifts) evaluations.

In the relativistic case the binary space, time and action ($p_0 t - \mathbf{p}\cdot\mathbf{r}$) operators

are given by 

''—(£)• '>'•(■?) and *-"*■(£) (16> 

respectively. Averaging the action operator $a$ with respect to the Klein-Gordon wave packet

$$\Phi^{(+)}(\mathbf{r}, t) = (2\pi)^{-j/2}\int \frac{d\mathbf{p}}{\sqrt{2p_0}}\, a^{(+)}(\mathbf{p}) \exp(-ipx)$$

where $px = p_0 t - \mathbf{p}\cdot\mathbf{r}$, one obtains




so that the implied space-shift and time (time-shift) imprecisions are 

■ W -(/-2>(£) ("> 



^ = (/-2)(^) 


respectively. Requiring now the action imprecision to be larger than $\hbar/2$, it follows that the relativistic binary description of the action maintains $\hbar/2$ as the natural unit of the action only when

m c ft ft 

2p 2 ~2 


i.e. when the existence of the threshold velocity $(\sqrt{2}/2)c$ is quantum-mechanically allowed. In this respect we have to consider that for velocities larger than $(\sqrt{2}/2)c$, the elastically scattered particle (in the centre-of-mass system) ceases to preserve the initial single free-particle individuality. As a consequence we can no longer ignore the structure of the particle, so that in the high-energy region ($v > (\sqrt{2}/2)c$) the particle is in fact replaced by the system of its constituents. In this respect we shall attribute a fundamental theoretical meaning to the threshold velocity $(\sqrt{2}/2)c$, in agreement also with the fact that the same threshold velocity value has been obtained within general relativity theory (Jaffe and Shapiro, 1972).

We can also remark that the existence of the natural unit of the action implies 
the existence of the natural space unit 

8<s = 

2m c 

or, alternatively, of the natural time unit 

8 J 





which is binarily 'equivalent' to the constant $\hbar/2m_0c^2$. In such conditions the Compton wavelength $\hbar/2m_0c$ can also be interpreted as the extent of the spatial localization region of the free particle. This interpretation agrees with the fact that the spatial localization (overlapping) of the free-particle field operators at two points in space is of the same order as the Compton wavelength (see e.g. Schröder, 1964; Griffith, 1974). However, the existence of the factor $\sqrt{2}$ in the expression (23) needs some additional explanation. Firstly, we have to mention that generally the threshold velocities implied by the space imprecisions are not identical with the ones corresponding to the time imprecisions (Papp, 1973). On the other hand, a special juncture arises when we compare the results of Schild (1948) and Hill (1955). Allowing the existence of the space-time quanta and imposing the (extended) requirement of Lorentz invariance, it is implied that there is only a certain set of allowed velocities, which corresponds to the existence of the integral-number co-ordinates (Schild) and of the rational-number co-ordinates (Hill), respectively. But in the particular case when the rational number is also an integral one, the above sets are not (as one would expect) identical. In such a situation it would be justifiable to conclude that, in the high-energy region, the existence of a certain velocity allowance cannot be overcome (when also imposing the Lorentz invariance condition within approaches supporting the existence of the space-time quanta).

Averaging the binary time operator one obtains the imprecision

S (i) t = (j 



which is larger than the expression (20) by the imprecision amount $\langle 1/2p_0\rangle$. As

$$\frac{p_0}{2p^2} < 2\,\frac{1}{2p_0}$$

for $v > (\sqrt{2}/2)c$, we may conclude, by virtue of the above binary 'equivalence', that the average $\langle 1/2p_0\rangle$ expresses in fact the time imprecision in the high-energy region. The binary space operator leads to the same space imprecision of $(j-1)\langle 1/2p\rangle$.
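
The velocity threshold attached to the inequality $p_0/2p^2 < 2\,(1/2p_0)$ can be made explicit in a few lines. The worked step below is a sketch that assumes units with $c = 1$ and the usual free-particle relations $p_0^2 = p^2 + m_0^2$ and $p = m_0 v/\sqrt{1 - v^2}$; these conventions are assumptions made for the illustration.

```latex
\frac{p_0}{2p^2} < \frac{2}{2p_0}
\;\Longleftrightarrow\; p_0^2 < 2p^2
\;\Longleftrightarrow\; m_0^2 < p^2
\;\Longleftrightarrow\; \frac{v^2}{1 - v^2} > 1
\;\Longleftrightarrow\; v > \frac{\sqrt{2}}{2}
```

Restoring $c$, the inequality holds exactly for $v > (\sqrt{2}/2)c$, i.e. for momenta $p > m_0c$.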

Analysing the binary meaning of the action operator $tp_0 - \mathbf{r}\cdot\mathbf{p}$, where $t$ is now the time parameter, one obtains the action imprecision $(j-1)\hbar/2$, the time-shift imprecision



the time-imprecision 





and the space-shift imprecision 

Excepting the dimensional factor $(j-1)$, it can be easily remarked that, at the threshold velocity $(\sqrt{2}/2)c$, the action and both the time imprecisions become identical with the ones of (18) and (22), respectively. Similarly, the $\mathbf{r}\cdot\mathbf{p}$-action leads to the imprecisions

thus confirming again, at least 'binarily', the non-equivocal values of the 
space-time imprecisions. 


The action operator (p , t p ) possesses the imprecision 

J 0) a = 


\ 2p 2 

2m \ 



which implies, now in a separate direct way, the existence of the threshold velocity $(\sqrt{2}/2)c$ for $j = 1$. It is worthwhile mentioning that the action operator $tp$ possesses the imprecision $\hbar/2$. This operator is a binary one only for the Klein-Gordon particle. For the Dirac particle the binary action and space-time quantizations can be similarly performed.

The imprecision of the binary proper-time operator is essentially given — 
now in a manifestly Lorentz invariant way — by the Compton wavelength 
(Papp, 1972b). In this respect further evidence is also given concerning the 
meaning of this length as the natural space constant of the free particle. In this 
case the following space- and time-shift imprecisions are implied 

*'"<=>(£)• s<l Hi) 


for the Dirac-particle, whereas 

^-</-i>(£). a».-(/-i)(-L) 

i P i 



are the space- and time-shift imprecisions for the Klein-Gordon particle. The 
implied time (t) imprecisions are given by 

^m 2 ) 

and (y 



respectively. The imprecision $\langle p_0/2m_0^2\rangle$ is binarily compatible with the one of $\langle p_0/2p^2\rangle$ in the high-energy region:

$$\frac{p_0}{2p^2} \le 2\,\frac{p_0}{2m_0^2}$$

for $v > (\sqrt{3}/3)c$. This latter threshold velocity agrees numerically with the one which has been defined in general relativity theory, too (Treder, 1974). We may thus conclude that the existence of certain common threshold velocities allows us to assume that some premises needed by a properly unified theory of space-time and matter have in fact already been fulfilled.
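
The numerical value of this second threshold can be checked directly. The sketch below assumes that the binary compatibility condition reads $p_0/2p^2 \le 2\,(p_0/2m_0^2)$ in units $m_0 = c = 1$, and locates by bisection the smallest momentum at which it holds; the function names are hypothetical and chosen only for this illustration.

```python
import math

def velocity(x):
    """Return v/c for a free particle with momentum p = x * m0 * c."""
    return x / math.sqrt(1.0 + x * x)

def is_binarily_compatible(x):
    """Check p0/(2 p^2) <= 2 * p0/(2 m0^2) in units m0 = c = 1, with p = x."""
    p0 = math.sqrt(1.0 + x * x)
    return p0 / (2.0 * x * x) <= 2.0 * p0 / 2.0

def threshold_momentum(lo=1e-3, hi=10.0, steps=60):
    """Bisect for the smallest momentum at which the inequality holds."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if is_binarily_compatible(mid):
            hi = mid
        else:
            lo = mid
    return hi

x_threshold = threshold_momentum()
print(velocity(x_threshold), math.sqrt(3.0) / 3.0)
```

The bisection converges to $p = m_0c/\sqrt{2}$, whose velocity coincides with $(\sqrt{3}/3)c$, so under the stated assumption the inequality holds for all larger velocities.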


There is a certain formal analogy between the binary space-time description and the one of the space and time uncertainties. Indeed, in both cases the existence of a certain interval associated with the measuring process is considered. But whereas in the first case the interval is the primary element of a theoretical description which has to express by itself the 'objective' imprecision of the measurement, the uncertainty interval (centred around the mean value, too) expresses in a rather conventional manner the general statistical accuracy limits of the measurement. In spite of these essential distinctions, we shall prove that certain space and time uncertainty contributions, the so-called uncertainty units, can be placed on the same footing as the space and time imprecisions (Papp, 1975). This fact is valid not only for the binary space-time operators, but also for the hermitian one-dimensional space operators.

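Let us begin with the evaluations of the space uncertainty for the Dirac particle. Using the relations

$$\left(\frac{\mathbf{p}}{p}\cdot\frac{\partial}{\partial\mathbf{p}}\right)\left(\frac{\mathbf{p}}{p}\cdot\frac{\partial}{\partial\mathbf{p}}\right)\exp i\mathbf{p}\cdot\mathbf{r} = -\frac{1}{p^2}(\mathbf{p}\cdot\mathbf{r})^2 \exp i\mathbf{p}\cdot\mathbf{r}$$

where $u(\mathbf{p}, s)$ is the positive energy spinor, there results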

<M p ? >e><->')>(i> 

.((e.A arg6(p , 5) ))- 2 ^.A argt(p , s) ) 

where the $j$-dimensional Dirac particle wave packet

^ (+) (r, s, t) = Git)"" 2 f dp\p\(p, s) exp (-ipx) 
has been used. Allowing for simplicity the approximations 










there results 

fr^M-®<H««'fc'>> (42) 

Ar^<#-ft> 2 -A«V+Ar ( * >a 

pj +^ r PJ 

(unit) 2 


*s B, -<(H'-i»M> 

is the square of the minimum space uncertainty, 




is the square of the space uncertainty unit and where $\Delta v^2$ is the square of the velocity uncertainty. The space uncertainty contribution $\Delta v\,t$ can be neglected by formally taking $t = 0$. Using the Kronecker symbols, the expression (45) takes the form

We can easily remark that in the two-dimensional case there arises an additional contribution due to the space imprecision $\langle 1/2p\rangle$. For $j = 1$ and $j = 3$, one


a (unit) " 

• pJ 2m c 


4m c 

, 7 = 1,3 


for $\langle v^2\rangle < c^2/2$. The above results confirm our assumption that the space-uncertainty unit possesses the physical meaning of the space imprecision.
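
The radial-derivative identity used at the start of the Dirac-particle evaluation, $(\hat{\mathbf{p}}\cdot\partial_{\mathbf{p}})^2 \exp i\mathbf{p}\cdot\mathbf{r} = -p^{-2}(\mathbf{p}\cdot\mathbf{r})^2 \exp i\mathbf{p}\cdot\mathbf{r}$, can be verified in two steps. The short derivation below is a sketch written with $\hat{\mathbf{p}} = \mathbf{p}/p$; it relies only on the fact that $\hat{\mathbf{p}}\cdot\mathbf{r}$ does not change along the radial direction in momentum space.

```latex
% First application of the radial derivative
\hat{\mathbf{p}}\cdot\frac{\partial}{\partial\mathbf{p}}\, e^{i\mathbf{p}\cdot\mathbf{r}}
  = i\,(\hat{\mathbf{p}}\cdot\mathbf{r})\, e^{i\mathbf{p}\cdot\mathbf{r}}

% The prefactor is constant along the radial direction
\hat{\mathbf{p}}\cdot\frac{\partial}{\partial\mathbf{p}}\left(\frac{\mathbf{p}\cdot\mathbf{r}}{p}\right)
  = \frac{\mathbf{p}\cdot\mathbf{r}}{p^2} - \frac{(\mathbf{p}\cdot\mathbf{r})\,p^2}{p^4} = 0

% Second application therefore acts on the exponential only
\left(\hat{\mathbf{p}}\cdot\frac{\partial}{\partial\mathbf{p}}\right)^{2} e^{i\mathbf{p}\cdot\mathbf{r}}
  = \bigl(i\,\hat{\mathbf{p}}\cdot\mathbf{r}\bigr)^{2} e^{i\mathbf{p}\cdot\mathbf{r}}
  = -\frac{(\mathbf{p}\cdot\mathbf{r})^2}{p^2}\, e^{i\mathbf{p}\cdot\mathbf{r}}
```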

The calculations can be similarly performed for the Klein-Gordon particle, 
thus obtaining the results 

Arr-=(3-/)0-l){^)-(^) (48) 

so that 





We can see that in the one- and three-dimensional cases the space uncertainty unit of the Klein-Gordon particle takes the maximum imaginary value at the threshold velocity $(\sqrt{2}/2)c$:

^""'"-w- '"" (50) 


In these conditions the minimum space uncertainty of the Klein-Gordon particle has to be generally smaller than half of the Compton wavelength. The above requirement would also signify that the (elastically scattered) Klein-Gordon particle is not necessarily a free one in the high-energy region. Indeed, the one-dimensional space uncertainty evaluation has to take positive values, as the space operator is hermitian. On the other hand, if we require the minimum space uncertainty $\hbar/2\Delta p$ to be larger than $\hbar/4m_0c$, it follows that $\Delta p < 2m_0c$. In these conditions, in order to assure (irrespective of any particular case) the general validity of the quantum-mechanical description, we also have to consider the condition $\langle p\rangle < 2m_0c$, as there exist cases when $\Delta p = \langle p\rangle$. Otherwise, there would exist cases for which $\Delta p > 2m_0c$. Consequently the Compton wavelength is able to express the size of the Klein-Gordon particle only in the free case of a not too large energy (in order to also preserve the initial single-particle individuality).

In the non-relativistic case the square of the space-uncertainty unit is 




so that the space-uncertainty unit is 'binarily' identical to the space imprecision $\langle 1/2p\rangle$.

The time-uncertainty units can be similarly calculated, thus obtaining the results (Papp, 1975)





for the Dirac, Klein-Gordon and non-relativistic particles, respectively. Besides the space imprecision $\langle 1/2p\rangle$, there is now also implied the previously encountered collision time-shift imprecision $\langle m_0/2p^2\rangle$. It can now be easily shown that the squares of the space and time uncertainty units of the Dirac particle are larger than the corresponding squares of the Klein-Gordon particle by the amounts $\langle 1/4p_0^2\rangle$ and $\langle 1/4p^2\rangle$, respectively. These results are in fact in agreement with the expressions (31) and (32), thus proving, in this way also, the general inner consistency of the binary description. There is also a mutual compatibility of the space-time imprecisions with the space-time uncertainty units.

We can show that there is not an irreconcilable difference between the space and time imprecisions. Thus the average $\langle 1/2p\rangle$ is not only the space imprecision, but it possesses the meaning of a time imprecision, too. Similarly, the average $\langle 1/2p_0\rangle$ also possesses the meaning of a space imprecision.




Up to now the meaning and relevance of the Compton wavelength with respect 
to the (non-large energy) free (scattered) particle has been analysed. Proofs 
were given that the space-time imprecisions possess a well-defined physical 
significance within the quantum-mechanical collision time shift, binary space- 
time and space-time uncertainty descriptions. We shall now perform an 
approach which is able to define — in a unitary and direct manner — the 
Compton wavelength, the electron radius and (by extrapolation) the Bohr 
radius as natural space-time units. 

For this purpose let us assume that there is a 'spectrum' of the relevant space 
shift evaluations which correspond to the various levels of the quantum- 
mechanical description of matter. Such a space-shift 'spectrum' has to be 
described by the generalized 'eigenvalue' equation 

T P 8 '^ N i 


where N is the space-multiplicity parameter ('eigenvalue') and where, for 
convenience, a well-defined value of the angular momentum has been chosen. 
We shall also subsequently consider that the production processes which are 
expected to arise in the high-energy region can be qualitatively supported by a 
resonance emission approximation. The equation (55) has to define by itself the 
physical meaning of the space imprecision in the high-energy region. In this 
sense the equation (55) has to establish a close connection between the 
existence of the natural space-time units and the high-energy structural effects 
raised by the validity of the resonance emission approximation. Consequently, 
there is also implied a high-energy interacting-particle approach, as the 
collision interaction can be qualitatively supported by the formation and 
subsequent decay of a resonance state (Peres, 1966). In such conditions the 
above space shift has to be also considered as an interaction space shift (Papp, 
1972a). One would expect from the very beginning that the physically relevant N-values include $N = 1$ and $N = 3$. Indeed, in agreement with points (b) and (d) of the previous paper (Papp, 1973), the necessary condition for the binary variable to possess measurable meaning is $N \ge 1$, whereas $N \ge 3$ has to be considered the sufficient condition for the binary variable to possess (well-defined) measurable meaning.

Besides the above-formulated approach to the high-energy space-shift description, we can define another variant by using the high-energy time imprecision $\langle 1/2p_0\rangle$. Starting from the (space-) time imprecision behaviour of the time shift

$$\frac{d}{dp_0}\,\delta_l(p_0) = N\,\frac{1}{2p_0} \qquad (56)$$

one obtains the phase shift

$$\delta_l(p_0) = \frac{N}{2}\ln\frac{p_0}{m_0} \qquad (57)$$

where the present N-parameter is not necessarily identical with the one of equation (55). We shall consider for convenience, in agreement with the previous remarks, the relation (56) as a space-shift relation, i.e. we shall take (dimensionally) $p_0 = \sqrt{p^2 + m_0^2c^2}$. The scattered state function corresponding to the above phase shift is given (in the energy representation) by

$$\varphi_l(p_0) = g_l(p_0)\,\sin\delta_l(p_0)\,\exp i\delta_l(p_0) \qquad (58)$$
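
The step from the time-shift relation (56) to the logarithmic phase shift (57) is a single integration. The worked step below is a sketch that assumes the phase shift vanishes at the rest-mass threshold, $\delta_l(m_0) = 0$; this boundary condition is an assumption introduced for the illustration, chosen because it reproduces the form of (57).

```latex
\frac{d\delta_l}{dp_0} = \frac{N}{2p_0}
\quad\Longrightarrow\quad
\delta_l(p_0) = \int_{m_0}^{p_0} \frac{N}{2p_0'}\,dp_0'
              = \frac{N}{2}\ln\frac{p_0}{m_0}
```

With this form the phase shift reaches $\pi/2$ at $p_0 = m_0\exp(\pi/N)$ and $\pi$ at $p_0 = m_0\exp(2\pi/N)$.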

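Neglecting the influence of the wave-packet preparation, we shall take the form factor $g_l(p_0)$ to be a constant. Consequently, the interaction time imprecision takes the form

$$\left\langle\frac{1}{2p_0}\right\rangle = \left[\int_{m_0}^{p_0^{(\max)}} dp_0\,\sin^2\delta_l(p_0)\right]^{-1}\int_{m_0}^{p_0^{(\max)}} dp_0\,\frac{1}{2p_0}\,\sin^2\delta_l(p_0) \qquad (59)$$

so that

$$\left\langle\frac{1}{2p_0}\right\rangle = \frac{\hbar}{m_0c}\,\frac{\pi(N^2+1)}{N^3}\left(\exp\frac{2\pi}{N}-1\right)^{-1} \qquad (60)$$

where we have assumed that

$$\delta_l(p_0)\in[0,\pi], \qquad p_0^{(\max)} = m_0\exp\frac{2\pi}{N} \qquad (61)$$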
It can now be easily verified that 

m exp — 

<£* w > : 

m c 



for $N \ge 1$, where instead of $e^2/m_0c^2$ we can also consider, without appreciably affecting the above approximation, the twice smaller value $e^2/2m_0c^2$. Similarly, it follows that

(^■w) 5 — <63) 


2m c 
for $N \ge 3$. The existence of the inequality

(^-5,(po))>— , N^4 
\dp I rn c 

agrees with the binary description formalism. Indeed, the time-shift evaluations for $N = 3$ and $N = 4$ are binarily equivalent ones, as are the space constants $\hbar/2m_0c$ and $\hbar/m_0c$, too. We may thus conclude that the existence of the Compton wavelength and of the electron radius as the natural space units is a direct result of the binary description formalism of space and time.

In order to perform the resonance emission approximation we have to impose the maximum-value condition of the scattered state function (Kilian and Petzold, 1970)


M\ = 


so that the resonance energies are given by 


- „ w^) - 


m exp 


On the other hand the energy average is given by

$$\langle p_0\rangle = m_0c\,\frac{N^2+1}{2(N^2+4)}\left(\exp\frac{4\pi}{N}-1\right)\left(\exp\frac{2\pi}{N}-1\right)^{-1} \qquad (67)$$

so that the meaning of the narrow resonance approximation can be analysed in this way also by comparing, for the same N-values, the expressions (66) and (67).

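$$p_0^{(r,1)} = 23m_0, \quad p_0^{(r,3)} \simeq 2.87m_0, \quad p_0^{(r,4)} \simeq 2.26m_0 \qquad (68)$$

$$\langle p_0\rangle^{(1)} \simeq 107m_0, \quad \langle p_0\rangle^{(3)} \simeq 3.12m_0, \quad \langle p_0\rangle^{(4)} \simeq 2.46m_0$$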
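
The averages over the scattered state can be reproduced numerically from the definitions given earlier. The following minimal sketch assumes a constant form factor, the phase shift $\delta_l(p_0) = (N/2)\ln(p_0/m_0)$, the energy range $m_0 \le p_0 \le m_0\exp(2\pi/N)$ and units $m_0 = c = \hbar = 1$; it compares a Simpson-rule evaluation of $\langle p_0\rangle$ and $\langle 1/2p_0\rangle$ with the closed forms obtained by integrating $\sin^2\delta_l$ analytically under those same assumptions. The helper names are hypothetical.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def averages(N):
    """Numerical <p0> and <1/(2 p0)> weighted by sin^2(delta), m0 = 1."""
    p_max = math.exp(2.0 * math.pi / N)
    weight = lambda p0: math.sin(0.5 * N * math.log(p0)) ** 2
    norm = simpson(weight, 1.0, p_max)
    mean_p0 = simpson(lambda p0: p0 * weight(p0), 1.0, p_max) / norm
    mean_inv = simpson(lambda p0: weight(p0) / (2.0 * p0), 1.0, p_max) / norm
    return mean_p0, mean_inv

def closed_forms(N):
    """Closed-form <p0> and <1/(2 p0)> under the same assumptions."""
    E = math.exp(2.0 * math.pi / N)
    mean_p0 = (N * N + 1.0) / (2.0 * (N * N + 4.0)) * (E * E - 1.0) / (E - 1.0)
    mean_inv = math.pi * (N * N + 1.0) / (N ** 3 * (E - 1.0))
    return mean_p0, mean_inv

for N in (1, 4):
    print(N, averages(N), closed_forms(N))
```

Under these assumptions the numerical and closed-form values agree, giving an energy average of about $107\,m_0$ for $N = 1$ and about $2.47\,m_0$ for $N = 4$.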

Consequently, the above resonance emission approximation is mathematically consistent for the relatively larger N-values, but it could be qualitatively accepted, in a larger sense, even for $N = 1$. Around the value $N = 1$, a very small variation of the N-parameter implies large variations of the $\langle p_0\rangle$ and $\langle 1/p_0\rangle$ averages. In these conditions the inequality

$$\langle p_0\rangle \le 107m_0 \qquad (70)$$

which is valid for $N \ge 1$, can be practically replaced by the inequality

$$\langle p_0\rangle \le 137m_0, \qquad N \simeq 1 \qquad (71)$$

Consequently, the electron radius is able to fulfil its role both as natural space unit and as intrinsic size of the interacting electron only up to the 'electromagnetic' threshold velocity $v^{(em)}$ (Papp, 1974b).

As a consequence of the inequality (62), a lower bound of the N-parameter 
values can be defined. Indeed 




2m c 

so that 



>(— «i(p»)): 

2m c \dpo 

2m c 

he 137 





On the other hand, from the condition (61), one obtains 


p <m exp — <Po 

(max) _ 


= m exp — 


The so-defined upper energy bound agrees qualitatively with the upper cutoff momentum defined by Greenman and Rohrlich (1973). The extreme extrapolation $N \to 0$ implies not only the breakdown of the high-energy space-time (interaction) imprecision description, but also the breakdown of linear quantum electrodynamics, as that extrapolation leads to the appearance of deep non-linear effects.

Another extrapolation can be performed towards the large N-values. Thus, requiring the interaction shift to equal the Bohr radius:

2m e 


it results that 



But the value $1/\alpha$ is in fact an upper bound of the N-parameter, as the energy average becomes in this case practically identical to the rest-mass energy. In the present case the Bohr radius is also a natural unit, but now with respect to another quantum-mechanical stratum, that of the bound states of the atomic electron. Indeed, the space-imprecision average performed with respect to the p-momentum representation state function of the hydrogen atom is given, for $l = 0$, by

\2pf T^fi 1 A£i2k-1 4n*-lJ 
where $n$ is the main quantum number. The smallest space imprecision is now given by $8\hbar^2/3\pi m_0e^2$. This result confirms the above assumption concerning the role of the Bohr radius as a natural space unit.
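
The ground-state value $8\hbar^2/3\pi m_0e^2$, i.e. $8/3\pi$ Bohr radii, can be checked against the standard momentum-space density of the hydrogen 1s state. The sketch below works in atomic units ($\hbar = 1$, Bohr radius $= 1$), where that density is $|\varphi_{1s}(p)|^2 = 8/[\pi^2(1+p^2)^4]$, and evaluates the normalization and $\langle 1/2p\rangle$ with a Simpson rule; the cut-off `P_MAX` and the helper names are choices made only for this illustration.

```python
import math

P_MAX = 50.0  # momentum cut-off in atomic units; the neglected tail is negligible

def simpson(f, a, b, n=50000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def density(p):
    """|phi_1s(p)|^2 for hydrogen in atomic units, normalised over d^3p."""
    return 8.0 / (math.pi ** 2 * (1.0 + p * p) ** 4)

# Normalisation: integral of 4*pi*p^2 * density over p.
norm = simpson(lambda p: 4.0 * math.pi * p * p * density(p), 0.0, P_MAX)

# <1/(2p)>: the factor 1/(2p) is combined with 4*pi*p^2 to avoid dividing by zero.
mean_half_inverse_p = simpson(lambda p: 2.0 * math.pi * p * density(p), 0.0, P_MAX)

print(norm, mean_half_inverse_p, 8.0 / (3.0 * math.pi))
```

The numerical average coincides with $8/3\pi \approx 0.849$ in units of the Bohr radius, which matches the smallest space imprecision quoted in the text.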

There is a formal analogy between the electrostatic and gravitational interactions of two point particles. Thus, whereas the Compton wavelength maintains unchanged its role as natural space unit, there is the Schwarzschild radius $g(m_0/2c^2)$, where $g$ is the gravitational constant, which corresponds to the classical electron radius. In these conditions the Planck radius $(1/2c)\sqrt{\hbar g/c}$ (Planck, 1913) is even the geometrical average of the so-defined gravitational space units. It is justifiable to consider that the Planck radius is of a fundamental theoretical significance (see e.g. Wheeler, 1957; Treder, 1963; Markov, 1966; Motz, 1972). In this sense this radius depends explicitly only on the universal constants and it is also the space constant which takes the smallest value. From the quantum-mechanical point of view we can consider, in a strong analogy with the results obtained for the electrostatic field by Heisenberg and Euler (1936), that the Planck radius is mutually connected with the existence of the maximum observable value of the gravitational field strength. The meaning of the Planck radius as the critical distance value of quantum gravity is in agreement with this latter result.
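
The statement that the Planck radius is the geometrical average of the two gravitational space units can be verified with a few lines of arithmetic. The sketch below assumes the units $\hbar/2m_0c$ and $gm_0/2c^2$ as written in the text and uses CODATA 2018 values for the constants; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s (CODATA 2018)
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2 (CODATA 2018)
C = 299792458.0          # speed of light, m/s (exact)

def compton_unit(m0):
    """Natural space unit hbar / (2 m0 c)."""
    return HBAR / (2.0 * m0 * C)

def gravitational_unit(m0):
    """Gravitational space unit g m0 / (2 c^2)."""
    return G * m0 / (2.0 * C ** 2)

def planck_radius():
    """Planck radius (1 / 2c) * sqrt(hbar g / c)."""
    return math.sqrt(HBAR * G / C) / (2.0 * C)

def geometric_mean_unit(m0):
    """Geometrical average of the two space units for rest mass m0."""
    return math.sqrt(compton_unit(m0) * gravitational_unit(m0))

electron_mass = 9.1093837015e-31  # kg
proton_mass = 1.67262192369e-27   # kg
print(geometric_mean_unit(electron_mass), geometric_mean_unit(proton_mass), planck_radius())
```

The rest mass cancels in the product, so the geometrical average is the same for any particle and equals $(1/2c)\sqrt{\hbar g/c}$, about $8.08\times10^{-36}$ m with the constants above.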

We may thus conclude that the above-analysed natural space-time units are in fact various aspects of the same space-time imprecision ($\langle 1/2p\rangle$ or $\langle 1/2p_0\rangle$). The existence of the natural space-time units is actually required even by the mathematical binary description formalism, thus also proving both the relevance and the consistency of that formalism. In these conditions certain steps which are needed by a mathematically suitable description of the natural space-time units have been established, at least qualitatively.


Throughout this paper certain evidence concerning the existence of the space- 
time imprecisions as inner elements of an extended quantum-mechanical 
description has been analysed and discussed. Proofs have also been given that 
the existence of the natural space-time units is mathematically consistent with 
the binary description formalism. It turns out that the binary formalism expresses essential aspects of the quantum-mechanical description of space-time and matter. The space-time imprecisions, which were removed from the standard quantum-mechanical description, imply a quite natural extension of quantum mechanics. The so-extended formalism fulfils, at least for the moment, the requirements needed to define a quantum theory of the natural space-time units. One would also assume the present binary description formalism to be incomplete, so that further developments need subsequent extensions and refinements. Thus the binary description formalism cannot be adequately applied to the Coulomb or to the static gravitational interactions
(between two point particles) without additionally assuming the existence of a 
certain discrete-space model (Papp, 1974b). Indeed, we can assume that the 
experimental conditions to perform measurements on the short distance 
behaviour are more restrictive than the ones required by the usual large 
distance measurements. In this sense we have to consider that the space 
discretization methods imply certain additional restrictions needed to perform 
adequately the short distance measurements. The possibility exists to define 
the existence of the maximum observable value of the electrostatic field 
strength and of the upper bounds of the particle rest-mass and electric charge, 
too. All these facts lead us once again to conclude on the relevance and the deep physical significance of the space-time binary description formalism.


Aharonov, Y. and Bohm, D. (1964) 'Answer to Fock concerning the time-energy indeterminacy 
relation', Phys. Rev., 134 B, 1417-1418. 


Almond, D. (1973) 'Time operators, position operators, dilatation transformations and virtual particles in relativistic and nonrelativistic quantum mechanics', Ann. Inst. Henri Poincaré, 19 A,
Ambarzumian, V. and Iwanenko, D. (1930) 'Zur Frage nach Vermeidung der unendlichen Selbstrückwirkung des Elektrons', Z. Phys., 64, 563-567.
Baz, A. I. (1966) 'Life-time of intermediate states', Yader. Fiz., 4, 232.
Beck, G. (1929) 'Die zeitliche Quantelung der Bewegung', Z. Phys., 51, 737-739.
Blokhintsev, D. I. (1960) 'Fluctuations of space-time metric', Nuovo Cimento, 16,
Blokhintsev, D. I. (1973) Space and Time in the Microworld, Dordrecht, Boston.
Bohm, D., Hiley, B. J. and Stewart, A. E. G. (1970) 'On a new mode of description in physics', Int. J. Theoret. Phys., 3, 171-183.
Brill, D. R. and Gowdy, R. H. (1970) 'Quantization of general relativity', Rep. Prog. Phys., 33,
Broyles, A. A. (1970) 'Space-time position operators', Phys. Rev., 1 D, 979-988.
Bunge, M. (1970) 'The so-called fourth indeterminacy relation', Can. J. Phys., 48,
Coish, H. R. (1959) 'Elementary particles in a finite world geometry', Phys. Rev., 114, 383-388.
Darling, B. T. (1950) 'The irreducible volume character of events. A theory of the elementary particles and of fundamental length', Phys. Rev., 80, 460-466.
Das, A. (1960) 'Cellular space-time and quantum field theory', Nuovo Cimento, 18, 482-504.
Dirac, P. A. M. (1972) 'A positive energy relativistic wave equation', Proc. Roy. Soc. London, 328 A, 1-7.
Eberly, J. H. and Singh, L. P. S. (1973) 'Time operators, partial stationarity and the energy-time uncertainty relation', Phys. Rev., 7 D, 359-362.
Fick, E. and Engelmann, F. (1964) 'Quantentheorie der Zeitmessung', Z. Phys., 178, 551-562.
Flint, H. T. (1935) 'A relativistic basis of the quantum theory', Proc. Roy. Soc. London, 150 A,
Flint, H. T. (1937) 'Ultimate measurements of space and time', Proc. Roy. Soc. London, 159 A,
Flint, H. T. (1948) 'The quantization of space and time', Phys. Rev., 74, 209-210.
Flint, H. T. and Richardson, O. W. (1928) 'On a minimum proper time and its applications (1) to the number of chemical elements, (2) to some uncertainty relations', Proc. Roy. Soc. London, 117 A, 637-649.
Fock, V. A. (1962) 'Criticism of an attempt to disprove the uncertainty relation between time and energy', Zh. Eksperim. Teor. Fiz., 42, 1135-1139.
Fujiwara, I. (1970) 'Time-energy indeterminacy relationship', Prog. Theoret. Phys., 44, 1701-
Fürth, R. (1929) 'Über einen Zusammenhang zwischen quantenmechanischer Unschärfe und Struktur der Elementarteilchen und eine hierauf begründete Berechnung der Massen von Proton und Elektron', Z. Phys., 57, 429-446.
Gien, T. T. (1965) 'Relativistic formulation of the lifetime matrix in the potential theory of collision', J. Math. Phys., 6, 671-676.
Glaser, W. and Sitte, K. (1934) 'Elementare Unschärfe, Grenze des periodischen Systems und Massenverhältnis von Elektron und Proton', Z. Phys., 87, 674-686.
Goebel, G. J., Karplus, R. and Ruderman, M. A. (1955) 'Momentum dependence of phase shifts', Phys. Rev., 100, 240-241.
Greenman, M. and Rohrlich, F. (1973) 'Is there a maximal electrostatic field strength?', Phys. Rev., 8 D, 1103-1109.
Griffith, R. W. (1974) 'Explicit formula from field theory for the average intrinsic size of a real or virtual photon', Nuovo Cimento, 21 A, 435-470.
Heisenberg, W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Phys., 43, 172-198.
Heisenberg, W. (1936) 'Die Selbstenergie des Elektrons', Z. Phys., 65, 4-13.
Heisenberg, W. (1938a) 'Über die in der Theorie der Elementarteilchen auftretende universelle Länge', Ann. Phys. Lpz., 32, 20-33.
Heisenberg, W. (1938b) 'Die Grenzen der Anwendbarkeit der bisherigen Quantentheorie', Z. Phys., 110, 251-266.
Heisenberg, W. (1942) 'Die "beobachtbaren Grössen" in der Theorie der Elementarteilchen', Z. Phys., 120, 513-538.
Heisenberg, W. and Euler, H. (1936) 'Folgerungen aus der Diracschen Theorie des Positrons', Z. Phys., 98, 714-732.


Hellund, E. J. and Tanaka, K. (1954) 'Quantized space-time', Phys. Rev., 94, 192-195.
Henning, H. (1956) 'Die Unschärferelation in der Dirac-Gleichungen und in der relativistischen Schrödinger-Gleichung', Z. Naturforsch., 11 A, 101-118.
Hill, E. L. (1955) 'Relativistic theory of discrete momentum space and discrete space-time', Phys. Rev., 100, 1780-1783.
Iwanenko, D. (1931) 'Die Beobachtbarkeit in der Diracschen Theorie', Z. Phys., 72, 621-
Jaffe, J. and Shapiro, I. I. (1972) 'Lightlike behaviour of particles in a Schwarzschild field', Phys. Rev., 6 D, 405-406.
Jauch, J. M. and Marchand, J. P. (1967) 'The delay time operator for simple scattering systems', Helv. Phys. Acta, 40, 217-229.
Jordan, P. and Fock, V. (1930) 'Neue Unbestimmtheitseigenschaften des elektromagnetischen Feldes', Z. Phys., 66, 206-209.
Kadyshevsky, V. G. (1961) 'On the theory of quantization of space-time', Zh. Eksperim. Teor. Fiz., 41, 1885-1894.
Kalnay, A. J. (1971) The Localization Problem, Studies in the Foundations Methodology and Philosophy of Science, Springer-Verlag, Berlin, 4, 93-100.
Kalnay, A. J. and Toledo, B. P. (1967) 'A reinterpretation of the notion of localization', Nuovo Cimento, 48, 997-1007.
Kaluza, T. (1921) S. B. Akad. Wiss. Berlin, 966.
Kilian, H. and Petzold, J. (1970) 'Zur Begründung der Gamowschen Zerfallstheorie', Ann. Phys. Lpz., 24, 335-355.
Kim, D. Y. (1973) 'A possible role of universal length in the theory of weak interactions', Can. J. Phys., 51, 1577-1581.
Landau, L. and Peierls, R. (1931) 'Erweiterung des Unbestimmtheitsprinzips für die relativistische Quantentheorie', Z. Phys., 69, 56-69.
Latzin, H. (1927) 'Quantentheorie und Realität', Naturwissenschaften, 15, 161.
Levi, R. (1926) 'L'atome dans la théorie de l'action universelle et discontinue', C.R. Acad. Sci. Paris, 183, 1026-1028.
Lippmann, B. A. (1966) 'Operator for time delay induced by scattering', Phys. Rev., 151,
Lord, E. A., Sinha, K. P. and Sivaram, C. (1974) '"Cosmological" constant and scalar gravitons', Prog. Theoret. Phys., 52, 161-169.
Ludwig, G. (1972) 'An improved formulation of some theorems and axioms in the axiomatic foundation of the Hilbert space structure of quantum mechanics', Commun. Math. Phys., 26,
Mandelstamm, L. and Tamm, I. (1945) J. Phys. U.S.S.R., 9, 249.
March, A. (1936) 'Die Geometrie kleinster Räume', Z. Phys., 104, 93-99, 161-168.
March, A. (1937a) 'Zur Grundlegung einer statistischen Metrik', Z. Phys., 105, 620-632.
March, A. (1937b) 'Statistische Metrik und Quantenelektrodynamik', Z. Phys., 106, 49-69.
March, A. (1937c) 'Die Frage nach der Existenz einer kleinsten Wellenlänge', Z. Phys., 108,
March, A. (1941) 'Raum, Zeit und Naturgesetze', Z. Phys., 117, 413-436.
Markov, M. (1940) 'On the four "dimensionally" stretched electron in a relative quantum region', Zh. Eksperim. Teor. Fiz., 10, 1311-1338.
Markov, M. A. (1965) 'Can the gravitational field prove essential for the theory of elementary particles?', Suppl. Prog. Theoret. Phys. (extra number), 85-95.
Markov, M. A. (1966) 'Elementary particles with largest possible masses (quarks and maximons)', Zh. Eksperim. Teor. Fiz., 51, 878-890.
Möglich, F. and Rompe, R. (1939) 'Über einige Folgerungen aus der Existenz eines kleinsten Zeitintervalles', Z. Phys., 740-750.
Motz, L. (1962) 'Gauge invariance and the structure of charged particles', Nuovo Cimento, 26, 672-697.
Motz, L. (1972) 'Gauge invariance and the quantization of mass (of gravitational charge)', Nuovo Cimento, 12 B, 239-255.
Olkhovsky, V. S. and Recami, E. (1970) 'About a space-time operator in collision description', Lett. Nuovo Cimento, 4, 1165-1173.
Olkhovsky, V. S., Recami, E. and Gerasimchuk, A. J. (1974) 'Time operator in quantum mechanics (I. Nonrelativistic case)', Nuovo Cimento, 22 A, 263-278.
Papp, E. (1972a) 'Interaction time measurement and causality', Nuovo Cimento, 10 B, 69-78.


Papp, E. (1972b) 'The non-relativistic limit of a dynamical proper time', Nuovo Cimento, 10 B,
Papp, E. (1973) 'Peculiarities of the quantum-mechanical space-time description', Int. J. Theoret. Phys., 8, 429-441.
Papp, E. (1974a) 'Field theoretical space-time quantization', Int. J. Theoret. Phys., 9, 101-115.
Papp, E. (1974b) 'Imprecision description of the high-energy annihilation and production processes', Int. J. Theoret. Phys., 10, 123-143.
Papp, E. (1974c) 'An extended approach to the field theoretical time operators', Int. J. Theoret. Phys., 10, 385-389.
Papp, E. (1975) 'Meaning and bounds for the space and time uncertainty contributions', Ann.
Pavlopoulos, T. G. (1967) 'Breakdown of Lorentz invariance', Phys. Rev., 159, 1106-1110.
Penrose, R. and MacCallum, M. A. H. (1973) 'Twistor theory: An approach to the quantisation of fields and space-time', Phys. Rep., 6 C, 243-315.
Peres, A. (1966) 'Causality in S-matrix theory', Ann. Phys. (N.Y.), 37, 179-208.
Peres, A. and Rosen, N. (1966) 'Quantum limitations on the measurement of gravitational fields', Phys. Rev., 118, 335-336.
Peters, P. C. (1974) 'Propagation in a space-time lattice', Phys. Rev., 9 D, 3223-3228.
Planck, M. (1913) Vorlesungen über die Theorie der Wärmestrahlung, Johann Ambrosius Barth, Leipzig, 167-169.
Poincaré, H. (1913) Dernières pensées, Flammarion, Paris.
Pokrowski, G. I. (1928) 'Zur Frage nach der Struktur der Zeit', Z. Phys., 51, 737-739.
Proca, A. (1928) Sur la Théorie des Quanta de Lumière, Blanchard, Paris.
Remak, B. (1931) 'Zwei Beispiele zur Heisenbergschen Unsicherheitsrelation bei gebundenen Teilchen', Z. Phys., 69, 332-345.
Ruark, A. E. (1928) 'The limits of accuracy in physical measurements', Proc. Nat. Acad. Sci. Wash., 14, 322-328.

Schames, L. (1933) 'Atomistische Auffassung von Raum und Zeit', Z. Phys., 81, 270-282.
Schild, A. (1948) 'Discrete space-time and integral Lorentz transformations', Phys. Rev., 73, 

Schrödinger, E. (1930) 'Über die kräftefreie Bewegung in der relativistischen Quantenmechanik', Berliner Berichte, 418-428.
Schröder, U. E. (1964) 'Lokalisierte Zustände und Teilchenbild bei relativistischen Feldtheorien', Ann. Phys. Lpz., 14, 91-112.
Sivaram, C. and Sinha, K. P. (1974) 'Gravitational charges and the quantization of mass', Lett. Nuovo Cimento, 10, 227-230.
Smith, F. T. (1960) 'Lifetime matrix in collision theory', Phys. Rev., 118, 349-356.
Snyder, H. S. (1947) 'Quantized space-time', Phys. Rev., 71, 38-41.
Takano, Y. (1961) 'The singularity of propagators in field theory and the structure of space-time', Progr. Theoret. Phys., 26, 304-314.
Thomson, J. J. (1925/1926) 'The intermittence of electric force', Proc. R. Soc. Edinb., 46, 90-115.
Treder, H. (1963) 'Gravitonen', Fort. Phys., 11, 81-108.
Treder, H. (1974) 'Gravitationskollaps und Lichtgeschwindigkeit im Gravitationsfeld', Ann. Phys. Lpz., 31, 325-334.
Vialtzew, A. N. (1963) 'Discrete space and time' (in Russian), Isd. 'Nauka', Moskwa.
Wataghin, G. (1930) 'Über die Unbestimmtheitsrelationen der Quantentheorie', Z. Phys., 65,
Wataghin, G. (1932) 'Zur relativistischen Quantenmechanik', Z. Phys., 73, 121-129.
Wheeler, J. A. (1957) 'On the nature of quantum geometrodynamics', Ann. Phys. (N.Y.), 2,
Wheeler, J. A. (1962) Geometrodynamics, Academic Press, New York.
Wigner, E. P. (1955) 'Lower limit for the energy derivative of the scattering phase shift', Phys. Rev., 98, 145-147.
De Witt, B. S. (1960) 'The Quantization of Geometry', Institute of Field Physics, University of North
Yang, C. N. (1947) 'On quantized space-time', Phys. Rev., 72, 874.

Uncertain Cosmology 

CHRISTOPHER J. S. CLARKE


University of York, England 


The particularity of one's presuppositions should never be underestimated. I 
write from a set of assumptions which to the quantum physicist may seem 
peculiar: that one can and should provide a coherent mathematical scheme for 
the entire universe; that the scheme should admit a model that represents the 
universe as we perceive it to be, with ourselves as observers in it; that the 
occurrence of such a model within the scheme can, in some sense, explain both 
our own coming-into-being and also the nature of what we now observe. 

Most interpreters of quantum theory, being concerned only with certain
delineated subsystems of the universe, can with propriety assume the prior
existence of the macroscopic world of daily experience as a background given
in advance, in terms of which the subsystem must be explained. By rejecting
this in favour of a cosmological view,* I am forced into the contentious position
of using a quantum theory which includes the observer in the formalism. More
than this: I wish to do it in a way consistent with a general relativistic† treatment
of the universe, since such a treatment uniquely combines logical elegance with
observational consistency. This raises three particular problems.

*Certain cosmologists (Dicke, 1961; Collins and Hawking, 1973) have used our own existence as a
constraint on cosmological parameters. But this only gives a non-tautologous explanation of the
universe if the physical schema used is able to generate a reasonably restricted class of cosmological
models before such a constraint is imposed; it does not absolve us from developing models which
are potentially independent of our own existence.

†Within the term 'general relativity' I include all theories where space-time is represented as a $C^1$
manifold on which local inertial frames are specified by a metric of Lorentz signature, determined
along with other physical entities by field equations which may not necessarily be (either of) those
of Einstein.

(1). If Newtonian space-time be abandoned, there is no reason, either
physical or philosophical, to assume that the global properties of
space-time should be the same as the local properties which our
short-range observations reveal. In particular (as I have argued in detail
elsewhere (Clarke, 1976)) the universe of general relativity may not
admit any global time coordinate: if its existence (i.e. stable causality) is
assumed then it must be recognized as an additional postulate made
either in anticipation of future experimental evidence or as a temporary
expedient to ease the calculations. But without the assumption of global
time the traditional quantum-mechanical picture of a system evolving in
time is untenable.

(2). In a general space-time there is no symmetry group which will enable 
one to Fourier analyse a quantum field to give it an unambiguous 
particle interpretation. The definitions of particle number, particle 
creation rate, etc., are, in this situation, a matter of great controversy 
(Unruh, 1974).

(3). Since the structure of space-time is itself a dynamical variable, and not 
merely a fixed arena for other events, it must itself be quantized: firstly, 
because its source is composed of quantized fields; and, secondly, 
because it seems likely that there are regions of the universe where the 
space-time curvature is characterized by a length-scale small enough to 
be in the quantum domain. But the quantization of space-time is not 
only technically difficult. In addition, the removal of both a fixed 
background space-time and a reliable particle-representation leaves 
very little structure on which to hang an interpretation of any formalism 

This third point leads one to the central difficulty of quantum cosmology: if 
everything is quantized— space, time, particles, observers— then everything 
dissolves into a structureless haze from which it is impossible to extract any 
semblance of concrete reality. 

Such a viewpoint is intimately related to the place afforded to the uncertainty 
relations. For, as conceived by Heisenberg (1930) 'the statistical character of 
the relation (between values of dynamical quantities) depends on the fact that 
the influence of the measuring device is treated in a different manner than the 
interaction of the various parts of the system on one another . . . The chain of 
(determinate) cause and effect could be quantitatively verified only if the whole 
universe were considered as a single system— but then physics has vanished, 
and only a mathematical scheme remains' (p. 58). The more cosmology has 
developed, the more this observation of Heisenberg has been confirmed, that 
the simple extension of quantum theory to the cosmological domain yields a 
mere 'mathematical scheme' that stands in need of something else before 
physics can emerge. For him this addition comes through alternative descrip- 
tions which (Heisenberg, 1974) are 'complementary' to quantum theory in 
being compatible with it, but not deducible from it. 

The course of this chapter is the pursuit of this 'something else' in a context 
beyond the usual laboratory one: the context of a cosmological picture 
containing physically extreme regions where no distinction, however arbitrary, 
can be made between the observer and the observed, the 'measuring device' 
and the 'parts of the system', in Heisenberg's terms. Here the use of a
complementary description in the original restricted sense of Bohr (1928) is
impossible, and the uncertainty relations will not, as in the laboratory case,
appear as a limitation on the applicability of a classical corpuscular description 
(Heisenberg, 1930). 


First we must examine an unusual cosmological view which seems to offer the 
hope of dispensing with any addition to quantum theory. Having given a 
detailed philosophical critique elsewhere (Clarke, 1974), I shall here sum- 
marize the conclusions and develop further mathematical points. 

The theory (Wheeler, 1957; Everett, 1957; de Witt and Graham, 1973)
regards the universe as a single quantum mechanical system whose state vector
$\Psi$ undergoes a determinate Hamiltonian evolution in a Hilbert space $\mathcal{H}$. When
an observation is taking place $\mathcal{H}$ decomposes into the tensor product $\mathcal{H}_M \otimes \mathcal{H}_S$,
where $\mathcal{H}_S$ describes the observed microsystem and $\mathcal{H}_M$ describes everything
else. Correspondingly, $\Psi = \Psi_M \otimes \Psi_S$.

Suppose that $\Psi_S = \sum_i a^i \psi_i$, where $\psi_1, \psi_2, \dots$ are eigenstates in which the
quantity being measured has a definite value. The usual theory of measurement
is followed, according to which the state $\Psi_M \otimes \psi_i$ evolves (approximately) into
a state $\Psi_M^i \otimes \psi_i$, where $\Psi_M$ represents the state of affairs before the measurement
is made while $\Psi_M^i$ represents the state after a definite value, corresponding
to $\psi_i$, has been found. Thus the measurement as a whole is described by an
evolution from $\Psi_M \otimes \Psi_S = \Psi_M \otimes (\sum_i a^i \psi_i)$ into $\sum_i a^i \Psi_M^i \otimes \psi_i$.
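
As a minimal illustration of the evolution just described, consider the case of only two outcomes. This is a sketch, not part of the original argument: the symbol $U$ for the (linear) evolution over the duration of the measurement is introduced here for convenience. Linearity alone then fixes the final state:

```latex
% Two-outcome sketch. U is an assumed symbol for the linear (unitary) evolution
% over the measurement; it does not appear in the original text.
\begin{align*}
\Psi_S &= a^1\psi_1 + a^2\psi_2, \\
U(\Psi_M\otimes\psi_i) &\approx \Psi_M^i\otimes\psi_i \qquad (i=1,2), \\
U(\Psi_M\otimes\Psi_S)
  &= a^1\,U(\Psi_M\otimes\psi_1) + a^2\,U(\Psi_M\otimes\psi_2) \\
  &\approx a^1\,\Psi_M^1\otimes\psi_1 + a^2\,\Psi_M^2\otimes\psi_2 .
\end{align*}
```

Neither term is singled out by the dynamics: both apparatus states appear, each correlated with its own eigenstate of the microsystem.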

The Wheeler-Everett theory interprets this last state as the simultaneous 
existence of many copies of the universe, one for each index i. This means that 
at each measurement the universe splits into many branches, each branch 
corresponding to a separate definite value for the measurement. Every person 
in the universe is thus split into many copies which henceforth evolve indepen- 
dently of each other (by linearity). Each copy is, by construction, aware only of 
the (definite) outcome of the measurement in his branch so that to him it 
appears as if the state-vector has 'collapsed' onto an eigenstate.

All the branches are equivalent: there is no need to try to attach meaning to 
one branch being 'more likely' than another. The statistical interpretation of 
quantum theory is derived, not from such an additional postulate, but from a 
consideration of long sequences of experiments. From this it is shown that, in 
the limit as the lengths of the sequences tend to infinity, in all branches of the 
universe except for a set of measure zero the relative frequencies of the various 
possible outcomes for the experiment accord with the usual quantum mechanical
probabilistic interpretation of the $a^i$.
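
The limiting statement can be made concrete in the simplest case. What follows is a sketch under stated assumptions: two outcomes with $|a^1|^2 + |a^2|^2 = 1$, $N$ independent repetitions, and 'measure' read as the squared norm of a branch (the choice made in Everett's argument). The symbols $p$, $k$, $\mu$ and $\varepsilon$ are introduced here for illustration only.

```latex
% Branch weights after N repetitions of a two-outcome experiment.
% p, k, mu, epsilon are illustrative symbols, not taken from the text.
\begin{align*}
p &= |a^1|^2, \qquad 1-p = |a^2|^2, \\
\mu(k) &= \binom{N}{k}\,p^{k}(1-p)^{N-k}
  \quad\text{(total squared norm of the branches recording outcome 1 exactly $k$ times)}, \\
\sum_{k\,:\,|k/N-p|\ge\varepsilon}\mu(k)
  &\le \frac{p(1-p)}{N\varepsilon^{2}} \;\longrightarrow\; 0
  \qquad (N\to\infty).
\end{align*}
```

The bound is Chebyshev's inequality applied to the binomial weights. The number of branches is $2^N$ whatever the value of $p$; it is the norm-weighted measure, not the count of branches, that concentrates on relative frequencies near $p$.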

Now, it could well be argued that this splitting provides a mental picture that 
is simpler and more economical of hypothesis than that provided by the 
'collapse of the wave packet' description of von Neumann (1955) and others.
But in terms of physical verification the two approaches are equivalent, and
both stand in need of definite criteria which determine when it is that a
measurement takes place and how the Hilbert space is to be decomposed; 
information that is assumed to be given ab extra in both approaches to quantum 
theory. But in cosmology, by definition, nothing is extra to the dynamical 
formalism; there are no external fields— not even the geometrical structure of 
space-time if, as in general relativity, this geometry is itself a quantized 
dynamical variable. Thus I propose the following thesis: a system of quantum 
cosmology, as it is usually understood, cannot contain enough intrinsic struc- 
ture to allow one to use criteria which might characterize the occurrence 
of measurement situations.

Let me clarify this by a comparison with a classical case. There the dynamical
system might be represented by $N$ functions $\mathbb{R} \to \mathbb{R}^3$ (specifying the positions of
the $N$ particles of the system as functions of time) determined by $N$ coupled
ordinary differential equations. In interpreting such a system, it suffices to take
an existing structure within the mathematics (the relative positions of the
particles) and match it with a corresponding observed structure. In quantum
mechanics, however, the system is more usually represented mathematically by
one function $\mathbb{R} \to \mathcal{H}$ (the vector in Hilbert space as a function of time), again
governed by a differential equation. In interpreting this it is not enough to
match a structure intrinsic to the mathematics with something observable: the
interpretation itself sets up further implicit mathematical structure by distinguishing
between different vectors in the Hilbert space.

Note that there is in this respect no essential difference between classical and 
quantum systems, only a difference in the presentation. The classical system 
could be presented as one function $\mathbb{R} \to \mathbb{R}^{3N}$, while the quantum system could be
presented as an infinite set of equations for the coefficients of the state in a 
particular basis, and both are often found. But in quantum cosmology most of 
the structure beyond the specification of the Hamiltonian is ambiguous, and yet 
it contains virtually all the physical information. It is therefore vital to be 
self-conscious about the amount of structure already in the Hamiltonian, and 
the extent and role of the additional structure that is required. 
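
The two presentations mentioned above can be written side by side. This is a sketch assuming the standard Schrödinger form of the evolution equation and an orthonormal basis $\{e_n\}$ of the Hilbert space; the symbols $x_k$, $e_n$ and $c_n$ are introduced here and are not the author's.

```latex
% Classical and quantum systems, each in both presentations.
% x_k, e_n, c_n are illustrative symbols.
\begin{align*}
\text{classical:}\quad
  & x_k:\mathbb{R}\to\mathbb{R}^{3},\quad k=1,\dots,N
  \quad\Longleftrightarrow\quad
  x=(x_1,\dots,x_N):\mathbb{R}\to\mathbb{R}^{3N}, \\
\text{quantum:}\quad
  & \Psi:\mathbb{R}\to\mathcal{H},\qquad
  i\hbar\,\frac{d\Psi}{dt}=H\Psi \\
  \Longleftrightarrow\quad
  & \Psi(t)=\sum_n c_n(t)\,e_n,\qquad
  i\hbar\,\frac{dc_m}{dt}=\sum_n \langle e_m|H|e_n\rangle\,c_n .
\end{align*}
```

In the second quantum form the physical content sits in the choice of the basis $\{e_n\}$, which is exactly the kind of structure beyond the Hamiltonian that is at issue here.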

This can be illustrated in three cases. 

2(a) Field Theory 

Let us first suppose that the quantum system is arrived at by using a background 
(flat) space-time. The customary procedure would be to set up a Fock-space or 
equivalent representation of the free fields, and then introduce some interaction.
For a free field the pair $(\mathcal{H}, H)$, where $\mathcal{H}$ is the Hilbert space and $H$ the
Hamiltonian, conveys almost no information at all. The only intrinsic structure
for such a pair is given by the spectral type of $H$ (Plesner, 1969), and a free-field
Hamiltonian has merely a homogeneous spectral type covering the positive
reals with multiplicity $d$ (the cardinality of the integers). The system only
acquires some physical content when a particular Fock-space decomposition of
$\mathcal{H}$ is specified.


When it comes to interacting fields, the main problem is one of establishing 
what $\mathcal{H}$ is by some renormalization procedure. In the absence of any rigorous
treatment of this, one can only speculate as to the structure of $(\mathcal{H}, H)$; but it
would be surprising if it differed at all from the free-field case. 

In actual practice very little interest is paid to the nature of $\mathcal{H}$, the stress being
entirely on the operator algebras derived from the fields. By virtue of their 
interpretation, these have a very rich implicit structure. But in cosmology it is 
the states which must be related to the universe we observe, since there are no 
external measurements to correspond to 'observables'; and so I shall concen- 
trate on structure as it is manifested in the set of states. 

2(b) Infinite Fock Space 

The foregoing must be modified for cosmological application, in view of the 
likelihood (Schramm and Wagoner, 1974) of the universe being spatially 
infinite. Then it is more realistic to allow the space of 'states' to include 
descriptions of infinitely many particles (and not merely unboundedly many, as 
in Section 2(a)). A complication then arises when Fermi or Bose-Einstein
statistics are needed because the appropriate 'state'-space $\mathcal{H}^*$ is not a subspace
of the infinite tensor product $\mathcal{H}^\infty$ of one-particle states

$$\mathcal{H}^\infty = \bigotimes_{i=1}^{\infty} \mathcal{H}^{(i)}$$

but a subspace* of its algebraic dual. $\mathcal{H}^*$ is in fact not a Hilbert space; but a
Hamiltonian is defined in it as the dual of an appropriately defined operator on
the Hilbert space $\mathcal{H}^\infty$, and the arguments of Section 2(a) apply to this.

Because $\mathcal{H}^*$ is not a Hilbert space, its elements cannot be regarded as
interpretable states. It has, however, a concept of orthogonality and hence of 
orthogonal projection. A (mixed) state can then be defined as a subadditive and 
orthogonally additive function of the projectors into [0, 1]: there are no pure 
states. As it does not seem to be known whether there is a representation 
theorem for this situation (of the type given, for instance, by Langerholc 
(1965)) one cannot pursue the matter much further. I shall assume that it will 
ultimately be possible to deal with these states in the same way as mixed states 
in the separable-Hilbert-space case, to which I now turn.

*Explicitly $\mathcal{H}^*$ is the annihilator of the subspace of $\mathcal{H}^\infty$ consisting of all finite combinations of
vectors of the form

$$e_1^{(1)} \otimes \dots \otimes e_i^{(i)} \otimes \dots \otimes e_j^{(j)} \otimes \dots \;\pm\; e_1^{(1)} \otimes \dots \otimes e_j^{(i)} \otimes \dots \otimes e_i^{(j)} \otimes \dots$$

where $f^{(k)}$ denotes $f$ as a member of $\mathcal{H}^{(k)}$, the $k$th copy of $\mathcal{H}$, and the + (resp. -) sign gives Fermi
(resp. Bose-Einstein) statistics.

2(c) Mixed States

Suppose one has the situation in Section 2(a), except that now the triple
$(\mathcal{H}, H, \Phi)$ is given, where $\Phi$ is a mixed state (a self-adjoint operator of trace
class) representing the universe at some initial time $t_0$. There is now room for
considerable complexity since $H$ and $\Phi$ will not in general commute; but I shall
argue that interpreting quantum theory by using only this information, while
not impossible, is highly unsatisfactory: the available structure cannot provide
an acceptable basis for decomposing a vector into a superposition of macroscopic
states, as is required by any interpretation, including that of Wheeler and Everett.

At first glance an obvious procedure presents itself. Suppose that we want to
give an intrinsic characterization of the measurement process described in
Wheeler-Everett terms at the start of this section, where $\mathcal{H}$ splits* into $\mathcal{H}_1 \oplus \mathcal{H}_2$
and $\Phi$ decomposes accordingly into

$$\Phi = (\Phi_{11} + \Phi_{12}) \circ P_1 + (\Phi_{21} + \Phi_{22}) \circ P_2$$

(where $\Phi_{ij} : \mathcal{H}_i \to \mathcal{H}_j \subset \mathcal{H}$, $P_i : \mathcal{H} \to \mathcal{H}_i$). It is a characteristic of the measurement
situation (Daneri, Loinger and Prosperi, 1962) that in these circumstances
statistical processes ensure that $\|\Phi_{12}\| = \|\Phi_{21}\| \to 0$, while the remaining components
are $H$-stable in that $[\Phi_{11}, H]$ maps (approximately) from $\mathcal{H}_1$ to $\mathcal{H}_1$ and
similarly for $\mathcal{H}_2$. Thus we can try to identify $\mathcal{H}_1$ and $\mathcal{H}_2$ by diagonalizing $\Phi$ and
then looking for partitions of the set of eigenvectors into subsets which each
span subspaces stable under $H$.

However, this procedure cannot separate out the subspaces $\mathcal{H}_1$ and $\mathcal{H}_2$ if the
probabilities of the two corresponding experimental results are almost equal,
as may happen if there is an exact symmetry between the two microscopic end
states of the system being observed. In that case the nature of the diagonalization
will be heavily influenced by the residual off-diagonal terms $\Phi_{12}$, which
would cause the procedure to select as possible outcomes states which should in
fact be regarded as unacceptable superpositions of macroscopically distinct states.
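
The sensitivity to near-equal probabilities can be seen in a two-dimensional sketch. Assume, purely for illustration, that the two outcome subspaces are each one-dimensional, spanned by unit vectors $e_1$ and $e_2$, and that the mixed state has diagonal entries $p_1$, $p_2$ and a small real residual off-diagonal entry $\varepsilon$. All of these symbols are introduced here and are not taken from the text.

```latex
% 2x2 toy model of the mixed state in the basis (e_1, e_2).
% theta is the angle by which the eigenvectors are rotated away from e_1, e_2.
\[
\Phi=\begin{pmatrix} p_1 & \varepsilon\\ \varepsilon & p_2\end{pmatrix},
\qquad
\tan 2\theta=\frac{2\varepsilon}{p_1-p_2}.
\]
% Unequal probabilities, |p_1 - p_2| >> |epsilon|:
\[
\theta\approx\frac{\varepsilon}{p_1-p_2}\approx 0,
\qquad\text{eigenvectors}\approx e_1,\ e_2 .
\]
% Equal probabilities, p_1 = p_2 = p, any epsilon != 0:
\[
\theta=\frac{\pi}{4},
\qquad\text{eigenvectors}=\frac{e_1\pm e_2}{\sqrt{2}},
\qquad\text{eigenvalues}=p\pm\varepsilon .
\]
```

In the symmetric case the eigenvectors are equal-weight superpositions of the two outcomes however small $\varepsilon$ becomes, so diagonalization alone returns precisely the states that ought to be excluded.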

The difference between the values of the various probabilities involved (the 
eigenvalues of $\Phi$) is not the essential criterion in distinguishing possible
outcomes. That criterion is the macroscopic dissimilarity of the various possibilities,
and the diagonalization of $\Phi$ is linked with it only in certain cases. In
general one must recognize the existence of some fundamental structure which 
corresponds to this dissimilarity, and is not equivalent to any intrinsic property 


Let us suppose that as well as a Hilbert space and a Hamiltonian† we have also
as additional structure a particle decomposition, in the form of a Fock-space
representation of $\mathcal{H}$ (supposing, for simplicity, a finite but unbounded number
of particles). This might seem very drastic, but it is hard to see how one can get
away with any less. One consequence of this is that the problem of defining
particles in curved space in order to study particle creation in cosmology
(Parker and Fulling, 1972) is now reversed: particles are supposed given at the
outset, and the problem is now to define the space in which they are situated.
Instead of particle-creation one has space-annihilation.

*Here $\mathcal{H}_1$ and $\mathcal{H}_2$ correspond to two different outcomes to the measurement.

†For the sake of definiteness, and to remain with familiar territory, I have phrased my account in
terms of a conventional $(\mathcal{H}, H)$ formalism; but this is by no means essential, and is probably

Although particles may be part of the a priori structure, they cannot directly 
give the sort of information needed to define macroscopic states and measure- 
ments. The universe cannot always be split into apparatus and microsystem 
along particle lines, since particle number could be one of the dynamical 
quantities being measured. The process of defining macroscopic states must be 
two-stage: first, the particle structure has to define a spatial structure; then 
criteria for macroscopic interpretability have to be formulated in terms of 
spatial structure. 

The basic ideas for passing from particles to space are fairly simple, and 
limited progress has been made in their formal articulation. Consider an 
$N$-particle state of the form $\mathcal{S}(\phi_1 \otimes \phi_2 \otimes \dots \otimes \phi_N)$ where $\mathcal{S}$ denotes
(anti-)symmetrization. We may try to associate a 'distance' between $\phi_1$ and $\phi_2$ by
examining $|\langle\phi_1|H|\phi_2\rangle|^2$; the greater this quantity the greater the probability of
interaction between $\phi_1$ and $\phi_2$ and so the closer their 'distance'. The development
of this idea (Penrose, 1972) has actually been in terms of models where
there is no explicit Hamiltonian, but only quantities which may be thought of as 
scattering amplitudes. In the simplest case a geometry can be defined in terms 
of these quantities, which is a geometry of Euclidean directions. An important 
feature is that if one first sets up the amplitudes by using a conventional 
description, based on particle states which do not define precise directions, 
then the geometry which is deduced is still a normal Euclidean direction- 
geometry, but one that is not related in any simple way to the space with which 
one started. On this approach particles come first, and the 'real' space is the one 
which they define. 
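
To make the first step above tangible, here is one purely hypothetical way of turning interaction strengths into a 'distance'. Nothing in the text prescribes this particular function; the symbols $w_{ij}$, $w_{\max}$ and $d_{ij}$ and the choice of the logarithm are assumptions made only to show what 'greater matrix element, smaller distance' could look like.

```latex
% Hypothetical distance built from matrix elements of H between one-particle states.
% w_ij, w_max, d_ij and the logarithmic form are illustrative assumptions.
\begin{align*}
w_{ij} &= |\langle\phi_i|H|\phi_j\rangle|^{2}, \qquad
w_{\max} = \max_{i\neq j} w_{ij}, \\
d_{ij} &= -\log\frac{w_{ij}}{w_{\max}} \;\ge\; 0, \qquad
d_{ij}\to\infty \ \text{ as } \ w_{ij}\to 0 .
\end{align*}
```

Any strictly decreasing function of $w_{ij}$ would serve equally well as a first criterion of nearness. A function chosen this way need not satisfy the triangle inequality, which is one reason a more detailed geometry has to be built in a second step.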

One can speculatively indicate the form which a cosmology based on this 
might take. An N-particle state, as above, could be examined to establish a 
rough criterion of 'nearness', or locality. Then a more detailed geometry could 
be defined which held locally, in the sense that Minkowski geometry holds 
locally in general relativity. Still working in terms of a group of nearby particles, 
states would certainly be regarded as macroscopically distinct if there was no 
isomorphism of the geometry which made a suitably smoothed-out particle- 
density for the two states coincide, even approximately. (More precisely, this 
would give a criterion for the distinguishability of two mixed states, each 
corresponding to specific states for a local group of k of the N particles, the 
remainder being unspecified.) Finally, transition probabilities would have to be 
specified between states which were macroscopically interpretable (i.e. which 
were not superpositions of distinguishable states). This would be done using 
either a Hamiltonian formalism, or combinatorial laws which arise more 
naturally in the twistor theory development already cited (Penrose, 1972). 


Note one advantage of this emphasis on mixed states: one can pass with little 
modification to the infinite universe described in Section 2(b), where there are
no pure states. 

If the theory were to progress along lines like these, one can discern three 
aspects which would prove especially interesting. 

3(a) Renormalization 

Infinities arise both from the unboundedly large momenta that occur in loops, 
and from the unboundedly large number of intermediate particles that can 
appear in the perturbation expansion for an interaction. The first divergence, 
with which renormalization concerns itself, is an essential part of the usual 
space-time descriptions used and of the field-theory approach of adding 
interactions onto a basically free-field structure. The theory I am envisaging 
must start with a system of finite matrix elements, i.e. it must already be 
renormalized. This is another point in favour of twistor theory, which automat- 
ically yields finite answers to scattering problems (Penrose, 1975). 

The second infinity (caused by non-convergence of perturbation theory 
expansions) is unlikely to arise, because particle density is automatically 
limited in a theory which places particles before space. As more particles 
appear, so more space appears, since space is simply the numerical relations 
between the particles. 

3(b) Local and Global Aspects 

In general relativity it is assumed that the local aspect of the universe to which 
we have immediate access is similar to all other local aspects, and that these 
local glimpses can be pieced together into a global model. In the quantum case 
there seems to be no reason why this should be so. That is to say, while the 
theory refers to the universe as a whole, there is no reason to suppose that there 
should be globally defined macroscopic states and transition probabilities. The 
desire for this, which haunts much of current work in quantum cosmology, 
stems from the mistaken attitude of viewing the universe from the outside, as 
we would observe an atom in a crystal. In reality we are part of the universe and 
the states with which the scientist is concerned are states relative to himself. All 
our observations are local, at least in the sense that the domain of galaxies over 
which they extend is one in which the geometry departs only modestly from the 
Euclidean; and, on a purist view, they are very local, in that we are concerned 
directly only with photons and particles which arrive here on earth, and 
arguments as to their source are a matter of indirect inference. 

So we should not be disappointed if we fail to obtain a god-like view of the 
universe as a whole. What we can demand is that which is scientifically testable: 
a theory which enables us to predict and understand future observations in
terms of present ones. In practice, of course, one would make use of the global
understanding of the universe which we think we already have: a mixed state in 
which a local group of particles only is specified can be regarded as a mixture of 
globally specified states, each of which can be analysed by analogy with a 
conventional cosmology. But since our present observations do not single out 
any unique cosmological model, many global models will be compatible with 
them; some having a global time, some being acausal and so on. All these 
participate in the mixed state which is defined by our observations, and define 
the states into which it may, probabilistically, turn. 

The basic structures of the theory are global and comprehend the entire 
universe. But their translation into observation requires the selection of some 
particular viewpoint, in the form of a local group of particles small enough to 
enable a spatio-temporal structure to be defined. The 'additional structure' in 
the theory specifies the totality of possible viewpoints, one of which is ours. 

3(c) Uncertainty 

The probabilistic nature of the predictions which emerge could, if one wished, 
be ascribed to some kind of complementarity between the Hilbert space 
description and the macroscopic-state description. I would see this as unneces- 
sarily dualistic, preferring to regard the Hilbert space structure as a kind of 
scaffolding on which to hang and manoeuvre the macroscopic states, and to use 
to calculate the transition probabilities. The physics is a physics of the macro- 
scopic states, and the relations between them are by their nature probabilistic. 
This enables us to return to the uncertainty relations in a cosmological setting, 
when they become relations constraining an intrinsically probabilistic scheme. 
They entail limitations on our measuring abilities, rather than being conse- 
quences of limitations, because the probabilistic structures of which these are 
an instance are there constraining the physics of the universe even if nothing is 
happening which could conceivably be called a measurement. 
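
For reference, the familiar general form of the uncertainty relations can be written for a mixed state. This is recalled as a standard result (the Robertson inequality), not as something derived in this chapter; the symbols $A$, $B$, $\Delta_\Phi$, $q$ and $p$ are introduced here, and $\Phi$ is assumed to be a trace-class state normalized to unit trace.

```latex
% Robertson form of the uncertainty relation for a mixed state Phi with tr(Phi) = 1.
\[
\Delta_\Phi A\;\Delta_\Phi B \;\ge\; \tfrac12\,\bigl|\operatorname{tr}\bigl(\Phi\,[A,B]\bigr)\bigr|,
\qquad
(\Delta_\Phi A)^2=\operatorname{tr}(\Phi A^2)-\bigl(\operatorname{tr}\,\Phi A\bigr)^2 .
\]
% Special case: coordinate q and momentum p with [q, p] = i hbar.
\[
\Delta_\Phi q\;\Delta_\Phi p \;\ge\; \frac{\hbar}{2}.
\]
```

The inequality refers only to the state and the operators; no measuring apparatus enters its statement, which is consistent with reading the relations as constraints on the probabilistic scheme itself.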

In short, the indeterminate physics which is uncovered in our laboratories is 
simply an aspect of the entire uncertain cosmology of which it is a tiny part. 


Bohr, N. (1928) 'The Quantum Postulate and the Recent Development of Atomic Theory', Nature, 121, 580-590.
Clarke, C. J. S. (1974) 'Quantum theory and cosmology', Phil. Sci., 41, 317-332.
Clarke, C. J. S. (1976) 'Time in general relativity', to appear in Minnesota Studies in the Philosophy of Science.
Collins, C. B. and Hawking, S. W. (1973) 'Why is the universe isotropic?' Astrophys. J., 180,
Daneri, A., Loinger, A. and Prosperi, G. M. (1962) 'Quantum theory of measurement and ergodicity conditions', Nucl. Phys., 33, 297-319.
De Witt, B. S. and Graham, N. (Eds.) (1973) The Many-Worlds Interpretation of Quantum Mechanics, Princeton University Press, Princeton.


Dicke, R. H. (1961) 'Dirac's cosmology and Mach's principle', Nature, 192, 440-441. 

Everett III, H. (1957) ' "Relative state" formulation of quantum mechanics', Rev. Mod. Phys., 29, 

Heisenberg, W. (1930) The Physical Principles of the Quantum Theory (trans. Eckart, C. and Hoyt, F. C.), University of Chicago Press, Chicago.
Heisenberg, W. (1974) 'Double dialogue', Theoria to Theory, 8, 11-34.
Langerholc, J. (1965) 'The trace formalism for quantum mechanical expectation values', J. Math. Phys., 6, 1210-1218.
Penrose, R. (1972) 'On the nature of quantum geometry', Magic without Magic: John Archibald Wheeler, Klauder, J. R. (Ed.), W. H. Freeman and Co., San Francisco.
Penrose, R. (1975) Quantum Gravity, Isham, C. J., Penrose, R. and Sciama, D. W. (Eds.), Oxford University Press, Oxford.
Parker, L. and Fulling, S. A. (1972) 'Quantized matter fields and the avoidance of singularities in general relativity', Phys. Rev. D, 7, 2357-2374.
Plesner, A. I. (1969) Spectral Theory of Linear Operators, Vol. II, Ungar, New York. 
Schramm, D. N. and Wagoner, R. V. (1974) 'What can deuterium tell us?' Phys. Today, 27, (12), 

Unruh, W. G. (1974) 'Alternative Fock quantization of neutrinos in flat space-time', Proc. Roy. Soc. London A, 338, 517-525.
von Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, trans. Beyer, R. T., Princeton University Press, Princeton.
Wheeler, J. A. (1957) 'Assessment of Everett's "Relative State" Formulation of Quantum Theory', Rev. Mod. Phys., 29, 463-465.


Uncertainty Principle and the 

Problems of Joint Coordinate-Momentum 

Probability Density in Quantum Mechanics 


Peoples' Friendship University, Moscow, U.S.S.R. 


The 50-year-old history of the development of quantum mechanics has been
extremely rich in attempts to reconsider its interpretation, to alter its
mathematical formalism and finally to create a new theory that would provide a 
more complete description of physical reality than the one offered by quantum 
mechanics. Among the investigations conducted in this field are those devoted 
to the search for the singular solutions of the equations of quantum mechanics 
(De Broglie, 1956; De Broglie and Andrade e Silva, 1971); the search for
particle-like solutions of the non-linear field theory (Finkelstein and cowor- 
kers, 1956; Glasko and coworkers, 1958; Rybakov, 1974); the attempts to 
introduce all kinds of 'hidden' parameters (Bohm, 1952; Pena-Auerbach and 
coworkers, 1972); the realization of various stochastic approaches to quantum 
mechanics (Fenyes, 1952; Bess, 1973); the attempts to explain quantum 
phenomena by the existence of an 'imaginary' or 'hidden' thermostat, 'sub- 
quantum medium' (Bohm and Vigier, 1954; Terletsky, 1960; De Broglie, 

The authors of such investigations usually proceed from the assumption that 
generally accepted quantum mechanics does not completely describe the 
physical reality and that it is possible to create a more profound theory which 
would treat all experimentally measurable quantities as simultaneously exist- 
ing physical realities. 

The incompleteness of the quantum-mechanical description was implicit in
the earliest works of the founders of quantum mechanics (De Broglie, 1927),
and since 1935 it has been something of an accusation levelled against the
fully established quantum mechanics (Einstein and coworkers, 1935;
Schrodinger, 1935). However, the thesis of the incompleteness of the
quantum-mechanical description remains unproved so far. This is because all
the proofs of the incompleteness can easily be refuted on the grounds that
quantum mechanics, owing to the well-known Heisenberg uncertainty principle,


$$\langle(\Delta r_j)^2\rangle\,\langle(\Delta p_j)^2\rangle \ge \frac{\hbar^2}{4} \tag{1}$$

rejects the concept of the coordinate and the momentum of a system existing
simultaneously as physical realities. But the statement of the completeness of
the quantum-mechanical description remains an assumption that has not been
proved either. This stimulates a search for a new theory, more profound than
quantum mechanics; the existence of such a theory has not been doubted by many
outstanding physicists (Einstein, 1948; De Broglie, 1953; Schrodinger, 1955).

In the construction of such profound theories, the principle of correspondence
between the sought-for theory and quantum mechanics plays a major part. In the
opinion of most investigators this correspondence means that the new theory
must explain the fundamental propositions of quantum mechanics as those of a
certain statistical theory which appears when the completeness of the
description of physical reality is partially sacrificed (when a certain
statistical averaging is undertaken). Thus, quantum mechanics makes quite
definite demands on the sought-for profound theory. It is quite natural that
the authors of different profound theories are anxious to satisfy these
demands first and foremost.

In its turn, the assumption that the sought-for profound theory exists makes
certain demands on quantum mechanics itself. This circumstance is usually
neglected by most investigators, yet it is the main obstacle in the way of
creating a profound theory. This can be illustrated as follows.

Quantum mechanics, in spite of its obvious and generally acknowledged
statistical character, is not a consistently probabilistic theory. It does not
make use of any joint probability distributions for physical quantities, for
example for coordinate and momentum, and it defines no conditional
probabilities. This fact leads to no contradictions within quantum mechanics,
since the quantum-mechanical description does not require that all physical
quantities be considered as simultaneously existing realities.

Let us assume now that there exists a theory giving a more complete
description of physical reality than quantum mechanics and treating all
quantities as simultaneously existing physical realities. Let this theory,
with the completeness of the description of physical reality partially
sacrificed (a certain statistical averaging employed), lead to a statistical
theory coinciding with quantum mechanics. But in renouncing the completeness
of the description and resorting to probabilities in this theory we shall
inevitably arrive at a statistical theory in which, along with the probability
of values of the physical quantity $A_1$ and the probability of values of the
quantity $A_2$, there will exist a joint probability of the values of
quantities $A_1$ and $A_2$, and the probability of values of the physical
quantity $A_2$ provided that $A_1$ has a certain definite value $A_1'$, i.e.
the conditional probability. In other words, the statistical theory thus
obtained, which by the tentative assumption coincides with quantum mechanics,
will inevitably follow the conventional probability scheme.

Hence, if the sought-for profound theory exists, then the concepts of joint 
and conditional probabilities can be introduced into quantum mechanics, i.e. 
quantum mechanics may be reduced to a consistent probability scheme. 

Attempts to introduce the concept of a joint probability density for various
physical quantities, and in the first place a joint coordinate-momentum
distribution (quantum distribution function, QDF), have been made repeatedly.
The earliest works in this field (Wigner, 1932; Terletsky, 1937; Blokhintsev,
1940) did not aim at introducing the QDF into quantum mechanics and considered
the proposed phase-space functions only as possible mixed representations of
the density matrix, which later proved extremely useful in concrete
quantum-mechanical problems (Klimantovitch and Silin, 1960; Imre and
coworkers, 1967; Arinshtein and Guitman, 1967; Gorshenkov and coworkers,
1973). It was only in 1949 that an attempt to interpret Wigner's function as a
QDF was made, apparently for the first time (Moyal, 1949). However, Moyal's
statistical interpretation of quantum mechanics did not gain much support,
since the sign-variability of Wigner's function (it takes both positive and
negative values) prevents it from being treated as a joint
coordinate-momentum probability density. In subsequent years a few more
concrete functions that might be considered as QDFs were suggested (Bopp,
1956; Margenau and Hill, 1961; Mehta, 1964; Cohen, 1966a, b; Shankara, 1967;
Kuryshkin, 1968; Ruggery, 1971; Zlatev, 1974). Except for Bopp's function they
all turned out to be sign-variable. Besides, investigations showed (Mehta,
1964; Cohen, 1966; Kuryshkin, 1968) that the choice of any of these functions
for the role of the QDF requires a certain correspondence rule (the rule for
constructing quantum operators) which does not coincide with the rule
(Neumann, 1932) used in quantum mechanics. In other words, the proposed QDFs
should be treated as no more than phase-space representations of the density
matrix (Imre and coworkers, 1967; Kuryshkin, 1969a, b; Ruggery, 1971;
Gorshenkov and Kognkov, 1973). Finally, in 1966 it was proved (Cohen, 1966a,
b) that in the generally accepted quantum mechanics, whose operators satisfy
Neumann's requirements (Neumann, 1932; Shewell, 1959), no QDF exists, neither
a non-negative one nor a sign-variable one. Thus, the concept of a joint
coordinate-momentum probability density cannot be introduced into the
generally accepted quantum mechanics, i.e. the generally accepted quantum
mechanics cannot be reduced to a consistent probability scheme. This
conclusion was also formulated and discussed in a number of other works (De
Broglie, 1964; Andrade e Silva and Lochak, 1969; Kuryshkin, 1974).
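
The sign-variability of Wigner's function can be checked numerically. The following is only a minimal sketch under stated assumptions: one dimension, units in which the Planck constant, mass and frequency equal one, the common convention W(q, p) = (1/pi) * integral of psi(q + y) psi(q - y) exp(2ipy) dy for a real wave function, and the first excited harmonic-oscillator state as a test case. The names `psi1` and `wigner` are illustrative and are not taken from the cited works.

```python
import math

def psi1(x):
    """First excited harmonic-oscillator state (hbar = m = omega = 1); real-valued."""
    return math.sqrt(2.0) * math.pi ** -0.25 * x * math.exp(-0.5 * x * x)

def wigner(psi, q, p, half_width=8.0, steps=4000):
    """Wigner function of a real wave function by the trapezoidal rule.

    For real psi the product psi(q + y) * psi(q - y) is even in y, so only the
    cosine part of exp(2ipy) contributes to the integral.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * psi(q + y) * psi(q - y) * math.cos(2.0 * p * y)
    return total * h / math.pi

value_at_origin = wigner(psi1, 0.0, 0.0)   # negative: close to -1/pi
value_off_origin = wigner(psi1, 1.5, 0.0)  # positive
```

Under these assumptions the function is negative at the origin and positive further out, which is exactly the property that rules it out as a probability density.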

Thus, the generally accepted quantum mechanics compels us: (a) either to 
reject the assumption of the existence of a theory that can provide a more 
complete description of physical reality or (b) assuming that such a theory does 
exist, to question the validity of the generally accepted quantum mechanics.

Therefore, while favouring the search for a profound theory, it is necessary in
the first place to reconsider the generally accepted quantum mechanics,
altering it so as to introduce into it a non-negative QDF interpreted as the
joint coordinate-momentum probability density. Such alterations, naturally,
must not violate those propositions that can be checked experimentally.
Heisenberg's uncertainty principle, which is a fundamental and indispensable
law of quantum theory, begins to play a very important part in this case. This
is because relation (1) forbids states of a physical system with strictly
determined coordinate and momentum, while any attempt to introduce a joint
coordinate-momentum probability into the quantum theory is equivalent to an
implicit assumption that a physical system can possess a definite momentum
together with a quite definite coordinate.

The possibility in principle of altering quantum mechanics with a view to
introducing a QDF into it was shown in the works of the author of this paper
(Kuryshkin, 1971, 1972a). Such an alteration was based on the fact that the
problem of constructing the operators $O(A)$ of physical quantities $A$ in
quantum mechanics has not been completely solved.

The generally accepted quantum mechanics makes use of operators satisfying a
set of requirements called the Neumann rule (Neumann, 1932; Shewell, 1959).
However, as far back as 1935 it was shown that this rule is not single-valued
and that attempts to get rid of this disadvantage lead to inner contradictions
(Temple, 1935a, b). Other known correspondence rules (Born and Jordan, 1925;
Dirac, 1958; Weyl, 1950; Tolman, 1938; Rivier, 1951; Yvon, 1946; Kuryshkin,
1968; Kerner and Sutsliffe, 1970) also suffer from a number of drawbacks
(Shewell, 1959; Groenewold, 1935; Kuryshkin, 1969b; Cohen, 1970). It must be
noted that all known correspondence rules, generally speaking, agree only in
the statement that:

$$O(q_j)\,O(p_r) - O(p_r)\,O(q_j) = i\hbar\,\delta_{jr} \tag{2}$$

Commutator (2) in the long run results in relation (1).

The works criticizing the well-known correspondence rules made it possible
to formulate a number of requirements for a 'non-contradictory' rule and to
construct it (Kuryshkin, 1971). The application of this rule to the construction
of quantum operators has led to a theory, named 'quantum mechanics with a 
non-negative QDF' (Kuryshkin, 1972a). To date this theory has been studied 
fairly fully (Kuryshkin, 1972b, c, 1973; Zaparovany, 1974; Zaparovany and 
coworkers, 1975). 

In this paper therefore, we will mostly concentrate on the principles of 
constructing theories of a quantum mechanical type possessing a non-negative 
QDF as well as on a brief analysis of certain concepts distinguishing these 
theories from the generally accepted quantum mechanics. 

It should be stressed once again that our concern will not be to offer another
interpretation of the generally accepted quantum mechanics but to construct a
new statistical theory which will comprise a major part of the
quantum-mechanical mathematical formalism, will have a consistent probability
character, and can therefore be considered without contradiction as a
statistical theory for the would-be more profound theories.

In order to pay maximum attention to the physical sense and not to burden
our paper with a lot of mathematical formulae, we will consider one-body
physical systems only [coordinate $r = (r_1, r_2, r_3)$, momentum
$p = (p_1, p_2, p_3)$] and pure quantum-mechanical states represented by a
vector $|\psi\rangle$. The generalization of everything that follows to
many-body systems, as well as to mixtures represented by a density matrix
$\rho$, presents no difficulties (Kuryshkin, 1972a, 1973; Zaparovany and
Kuryshkin, 1975).


In order to construct the most general class of statistical theories that
resemble quantum mechanics in their mathematical formalism and contain a
non-negative joint coordinate-momentum distribution, treated as a phase-space
probability density, let us proceed from the following two postulates.

1. Interpretation Postulate 

The state of a physical system at any instant of time $t$ is completely
described by a joint coordinate-momentum probability density $F(r, p, t)$; a
physical quantity $A$ can be represented by a coordinate-momentum-time
function $A(r, p, t)$; and the experimentally observable value
$\langle A\rangle$ of the physical quantity $A$ for a system in an $F$-state
is defined as:

$$\langle A\rangle = \int A(r, p, t)\,F(r, p, t)\,dr\,dp \tag{3a}$$

From the physical meaning of $F$, defined by this postulate, follow its
essential properties:

$$\int F(r, p, t)\,dr\,dp = 1 \tag{3b}$$

$$F(r, p, t) \ge 0 \tag{3c}$$

This postulate is only a common statement of classical statistical theory
and, in the case of $\delta$-like distributions, of classical mechanics as
well. However, considering the coordinate and the momentum as simultaneously
existing physical realities, we extend this statement to physical systems
possessing quantum properties. The question of the compatibility of this
postulate with the general property of quantum systems, expressed by
Heisenberg's uncertainty principle, remains open so far.
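
The content of this postulate can be made concrete with a small numerical sketch. It assumes a single degree of freedom, a Gaussian density chosen purely for illustration, and a midpoint-rule integration over a finite box; the function names and parameter values are hypothetical and do not come from the text.

```python
import math

def gaussian_density(q, p, q0=0.5, sigma_q=0.7, sigma_p=1.1):
    """An illustrative non-negative, normalized phase-space density F(q, p)."""
    norm = 1.0 / (2.0 * math.pi * sigma_q * sigma_p)
    return norm * math.exp(-0.5 * ((q - q0) / sigma_q) ** 2 - 0.5 * (p / sigma_p) ** 2)

def phase_space_average(a, f, half_width=10.0, n=400):
    """Midpoint-rule estimate of the integral of a(q, p) * f(q, p) over the box."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        q = -half_width + (i + 0.5) * h
        for j in range(n):
            p = -half_width + (j + 0.5) * h
            total += a(q, p) * f(q, p)
    return total * h * h

normalization = phase_space_average(lambda q, p: 1.0, gaussian_density)   # property (3b)
mean_q = phase_space_average(lambda q, p: q, gaussian_density)            # rule (3a) for A = q
mean_energy = phase_space_average(lambda q, p: 0.5 * (p * p + q * q), gaussian_density)
```

For this density the oscillator-like quantity (p^2 + q^2)/2 averages to (sigma_p^2 + sigma_q^2 + q0^2)/2, which the numerical estimate reproduces.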

2. Mathematical Formalism Postulate 

The state of a physical system at any instant of time $t$ is completely
described by a normalized vector $|\psi(t)\rangle$ of some state space
$\mathcal{L}$; any function of phase space and time $A(r, p, t)$ can, by a
certain linear rule, be represented by a linear operator $O(A)$ in
$\mathcal{L}$; and the experimentally observable value $\langle A\rangle$ of
the physical quantity $A$ for a system in a state $|\psi\rangle$ is defined
as:

$$\langle A\rangle = \langle\psi(t)|O(A)|\psi(t)\rangle \tag{4a}$$

where $\langle\psi_1|\psi_2\rangle$ is the scalar product of the vectors
$|\psi_1\rangle$ and $|\psi_2\rangle$. The normalization requirement for the
state vector, the linearity of the operators and the linearity of the
correspondence rule are understood as usual:

$$\langle\psi(t)|\psi(t)\rangle = 1 \tag{4b}$$

$$O(A)\{a|\psi\rangle\} = a\,O(A)|\psi\rangle, \qquad O(A)\{|\psi_1\rangle + |\psi_2\rangle\} = O(A)|\psi_1\rangle + O(A)|\psi_2\rangle \tag{4c}$$

$$O(1) = \hat{1}, \qquad O(aA) = a\,O(A), \qquad O(A_1 + A_2) = O(A_1) + O(A_2) \tag{4d}$$

where $a$ is a numerical coefficient and $\hat{1}$ is the unit operator in
$\mathcal{L}$. The second postulate is practically a slightly paraphrased
basic postulate of quantum mechanics. But in contrast to the generally
accepted quantum mechanics, the explicit forms and properties of the operators
[with the exception of the linearities (4c) and (4d)] remain undefined here.

It is essential in the first place to prove the compatibility of the
above-formulated postulates, i.e. to show that equations (3) and (4) do not
contradict each other. With this purpose in view let us introduce the
characteristic function

$$\tilde{F}(u, v, t) = (2\pi)^{-6}\int F(r, p, t)\,e^{-i(ur + vp)}\,dr\,dp \tag{5}$$

Expanding the exponential into a series and using relation (3a), the
characteristic function can be rewritten in the form:

$$\tilde{F}(u, v, t) = (2\pi)^{-6}\sum_{n,m}\frac{(-iu_1)^{n_1}\cdots(-iv_3)^{m_3}}{n_1!\cdots m_3!}\,\langle r_1^{n_1}\cdots p_3^{m_3}\rangle \tag{6}$$

where $n = (n_1, n_2, n_3)$ and $m = (m_1, m_2, m_3)$ are integer vectors.
Reconstructing now the probability density $F$ from the characteristic
function $\tilde{F}$ by inverting the integral transformation (5) and using
equations (6) and (4a), we obtain:

$$F(r, p, t) = (2\pi)^{-6}\sum_{n,m}\int du\,dv\,e^{i(ur + vp)}\,\frac{(-iu_1)^{n_1}\cdots(-iv_3)^{m_3}}{n_1!\cdots m_3!}\,\langle\psi(t)|O(r_1^{n_1}\cdots p_3^{m_3})|\psi(t)\rangle \tag{7}$$

Taking into account the linearity properties (4c) and (4d), equation (7) can
be rewritten as:

$$F(r, p, t) = \langle\psi(t)|\hat{F}(r, p, t)|\psi(t)\rangle \tag{8}$$

where $\hat{F}(r, p, t)$ is a linear operator in $\mathcal{L}$,
parametrically dependent on coordinate, momentum and time. The form of the
operator $\hat{F}$, or rather its relation to the operators
$O(r_1^{n_1}\cdots p_3^{m_3})$, is defined by equations (7) and (8). The
dependence of $\hat{F}$ on time is caused, in the general case, by the fact
that the mathematical formalism postulate does not rule out the
time-dependence of an operator $O(A)$ even if the corresponding function
$A(r, p)$ does not depend on time.
Let us turn now to the physical meaning of the operator $\hat{F}$ and its
properties.

Substituting relation (8) into equation (3a) and comparing the result with
(4a), we see that the operator $\hat{F}$ completely determines the
correspondence rule:

$$O(A) = \int A(r, p, t)\,\hat{F}(r, p, t)\,dr\,dp \tag{9}$$

Integrating equation (8) over phase space and taking into account the
normalizations (3b) and (4b), we obtain:

$$\int \hat{F}(r, p, t)\,dr\,dp = \hat{1} \tag{10}$$

Finally, from property (3c) and relation (8) it follows that:

$$\langle\psi|\hat{F}(r, p, t)|\psi\rangle \ge 0 \tag{11}$$

i.e. $\hat{F}$ is a non-negative definite operator in $\mathcal{L}$.

Thus, a theory satisfying the two initial postulates contains a linear
operator $\hat{F}(r, p, t)$, non-negative definite in $\mathcal{L}$,
parametrically dependent on coordinate, momentum and time, and normalized by
condition (10).

The physical meaning of the operator $\hat{F}$ is defined by relation (4a) of
postulate 2 and equation (8), i.e. $\hat{F}(r, p, t)$ is the operator of the
probability density of coordinate $r$ and momentum $p$ at the instant of
time $t$.
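
A finite-dimensional toy model may help to see how properties (9)-(11) fit together. The sketch below is not the construction of the paper: it replaces the continuous phase space by four discrete "points" and the state space by a two-component one, and uses four weighted projectors as hypothetical stand-ins for the probability density operator. All names are illustrative.

```python
import math

S = 1.0 / math.sqrt(2.0)
# Four unit vectors: the basis {|0>, |1>} and the rotated basis {|+>, |->}.
VECTORS = [[1.0, 0.0], [0.0, 1.0], [S, S], [S, -S]]

def weighted_projector(v, weight=0.5):
    """weight * |v><v| for a real two-component vector v."""
    return [[weight * v[i] * v[j] for j in range(2)] for i in range(2)]

# Toy stand-ins for the density operator: non-negative and summing to the unit operator.
DENSITY_OPS = [weighted_projector(v) for v in VECTORS]

def operator_for(values):
    """Discrete analogue of rule (9): O(A) = sum_k A_k F_k."""
    return [[sum(a * f[i][j] for a, f in zip(values, DENSITY_OPS)) for j in range(2)]
            for i in range(2)]

def expectation(op, psi):
    """<psi|op|psi> for a 2x2 operator and a two-component state."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j] for i in range(2) for j in range(2))

psi = [0.6 + 0.0j, 0.8j]            # normalized state: 0.36 + 0.64 = 1
values = [1.0, -2.0, 0.5, 3.0]      # arbitrary values A_k of some quantity A

probabilities = [expectation(f, psi).real for f in DENSITY_OPS]            # analogue of (8)
average_from_density = sum(a * p for a, p in zip(values, probabilities))   # analogue of (3a)
average_from_operator = expectation(operator_for(values), psi).real        # analogue of (4a)
unit_operator = operator_for([1.0, 1.0, 1.0, 1.0])                         # analogue of (10)
```

In this toy model the two ways of computing the average agree for every state, and the probabilities are non-negative and sum to one, which mirrors the compatibility argument above.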

One can easily determine the phase-space function $f(r, p, t)$ corresponding
to the operator $\hat{F}$ in agreement with the correspondence rule (9).
Indeed, writing down the probability density operator as

$$\hat{F}(\xi, \eta, t) = O(f(\xi, \eta, r, p, t)) \tag{12}$$

where $\xi$ and $\eta$ are the parameters of the operator $\hat{F}$ and of the
function $f$, from relations (7) and (8), which determine the operator
$\hat{F}$, we obtain:

$$f(\xi, \eta, r, p, t) = \delta(\xi - r)\,\delta(\eta - p) \tag{13}$$

Here $\delta(x)$ is Dirac's three-dimensional $\delta$-function. Hence, the
coordinate-momentum probability density operator corresponds to the
phase-space $\delta$-function. This conclusion is in full agreement with
correspondence rule (9) and the physical meaning of the operator $\hat{F}$.

Let us consider now the problem of the dynamics which are permissible in a
theory which satisfies the initial postulates.

Let $|\psi(t)\rangle$ and $|\psi(t')\rangle$ determine the states of a
physical system at times $t$ and $t' \ge t$, respectively. Since, in
conformity with postulate 2, both these vectors belong to the same space
$\mathcal{L}$, they can always be related by the transformation:

$$|\psi(t')\rangle = \hat{S}(t', t)|\psi(t)\rangle, \qquad \hat{S}(t, t) = \hat{1} \tag{14}$$

where $\hat{S}(t', t)$ is a linear operator, parametrically dependent on $t'$
and $t$. Assuming in (14) that $t' = t + \delta t$, in the limiting case when
$\delta t \to 0$ we obtain:

$$\frac{\partial}{\partial t}|\psi(t)\rangle = \hat{X}(t)|\psi(t)\rangle \tag{15}$$

where $\hat{X}(t)$ is a linear operator parametrically dependent on $t$ and
related to the operator $\hat{S}$ by the following equation

$$\hat{X}(t) = \left.\frac{\partial\hat{S}(t', t)}{\partial t'}\right|_{t'=t}$$

The fact that the permissible dynamic equation (15) contains only the first
derivative of the state vector with respect to time is an immediate
consequence of postulate 2: the vector $|\psi(t)\rangle$ completely determines
the state of a system, and the knowledge of it is a sufficient condition for
finding the state vector $|\psi(t')\rangle$ at any instant of time $t' > t$.

Since the operator $\hat{X}$ determines the evolution of the system in time it
must be related to certain physical quantities. And since $\hat{X}$ is linear
[the consequence of postulate 2] and since any physical quantity can be
represented by a coordinate-momentum-time function [the requirement of
postulate 1], the operator $\hat{X}$ is related by the correspondence rule (9)
to a certain phase-space function, which can, in the general case, be complex,
i.e.:

$$\hat{X} = O(X), \qquad X(r, p, t) = Q(r, p, t) - iR(r, p, t) \tag{16}$$

Here $Q$ and $R$ are real functions.

Let us now take into account the normalization condition (4b). Since a
normalized vector $|\psi(t)\rangle$ must, by equation (15), automatically
evolve into a normalized vector $|\psi(t')\rangle$, we have:

$$\frac{d}{dt}\langle\psi|\psi\rangle = \langle\psi|\hat{X}|\psi\rangle + \langle\psi|\hat{X}^{+}|\psi\rangle = 0$$

Since the initial state $|\psi(t)\rangle$ can be arbitrarily chosen, it
follows that

$$O(X) = -O^{+}(X) \tag{17}$$

where $O^{+}(X)$ is the operator in $\mathcal{L}$ conjugate to the operator
$O(X)$. Since the correspondence rule (9) is linear and, owing to the
properties of the operator $\hat{F}$, gives self-conjugate operators for real
functions, by substituting (16) into (17) we obtain:

$$O(Q) = 0, \qquad -iO(R) = O(X) \tag{18}$$

The dynamic equation (15) will in this case take the form:

$$i\frac{\partial}{\partial t}|\psi(t)\rangle = O(R)|\psi(t)\rangle \tag{19}$$

where $R(r, p, t)$ is a real function.

Hence, a theory satisfying the two initial postulates contains a real
coordinate-momentum-time function $R(r, p, t)$ (a dynamic function) which,
with the help of the corresponding operator $O(R)$ and equation (19), defines
the evolution of a physical system in time.
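
The role of the normalization requirement in fixing the form of the generator can be seen in a small numerical experiment. The sketch below is illustrative only: it uses hypothetical 2x2 self-conjugate matrices in place of the operators for Q and R, and integrates the first-order equation with a classical fourth-order Runge-Kutta scheme.

```python
import math

def apply(matrix, vector):
    """Matrix-vector product for small nested-list matrices."""
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]

def evolve(generator, psi, t_end=1.0, steps=2000):
    """Integrate d|psi>/dt = X|psi> with the classical Runge-Kutta method."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = apply(generator, psi)
        k2 = apply(generator, [p + 0.5 * dt * k for p, k in zip(psi, k1)])
        k3 = apply(generator, [p + 0.5 * dt * k for p, k in zip(psi, k2)])
        k4 = apply(generator, [p + dt * k for p, k in zip(psi, k3)])
        psi = [p + dt * (a + 2 * b + 2 * c + d) / 6.0
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]
    return psi

def norm_squared(psi):
    return sum(abs(c) ** 2 for c in psi)

OP_R = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]   # self-conjugate, plays the role of O(R)
OP_Q = [[0.1, 0.0], [0.0, 0.1]]                  # self-conjugate, plays the role of O(Q)

def generator(q_scale):
    """X = q_scale * O(Q) - i O(R), the discrete analogue of (16)."""
    return [[q_scale * OP_Q[i][j] - 1j * OP_R[i][j] for j in range(2)] for i in range(2)]

psi0 = [0.6 + 0.0j, 0.8j]
norm_without_q = norm_squared(evolve(generator(0.0), psi0))   # stays equal to 1
norm_with_q = norm_squared(evolve(generator(1.0), psi0))      # grows
expected_growth = math.exp(2 * 0.1 * 1.0)                     # exact value for this O(Q)
```

With the real part switched off the norm is conserved to numerical accuracy, while any non-zero self-conjugate part in the generator makes it drift, in line with condition (18).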

By differentiating (8) with respect to $t$ and using equation (19) it is
possible in principle to obtain an equation for the probability density
$F(r, p, t)$ as well.

It will also contain only the first time derivative, which agrees with
postulate 1. But in order to determine the explicit form of this equation one
must be able to reconstruct the function $R(r, p, t)$ from the operator
$O(R)$, i.e. to know the explicit form of the coordinate-momentum probability
density operator $\hat{F}$.



The results (9)-(19) of the investigation of the compatibility of the two
postulates made in the previous section make it possible to formulate the
general principle of constructing the theories in question.

In order to construct a quantum-mechanics-like statistical theory with a
consistent probability interpretation it is necessary and sufficient to carry
out the following procedure:

(1). To represent the physical quantities as coordinate-momentum-time
functions $A(r, p, t)$ (20)

(2). To choose a space $\mathcal{L}$ of the state vectors $|\psi\rangle$ of
the physical system (21)

(3). To indicate a linear probability density operator $\hat{F}(r, p, t)$,
non-negative definite in $\mathcal{L}$, parametrically dependent on
coordinate, momentum and time, and normalized by the condition (22)

$$\int \hat{F}(r, p, t)\,dr\,dp = \hat{1} \tag{22a}$$

(4). To indicate a real dynamic function $R(r, p, t)$ of the phase space and
time, responsible for the evolution of the system (23)

The necessity of solving the above-listed problems was shown in the previous
section. The sufficiency can easily be demonstrated as follows.

Assume that the problems (20)-(23) are solved in a certain way. Then, bringing
into correspondence to any function $A(r, p, t)$ the operator, linear in
$\mathcal{L}$,

$$O(A) = \int A(r, p, t)\,\hat{F}(r, p, t)\,dr\,dp \tag{24}$$

and representing the physical state of the system by a normalized vector
$|\psi(t)\rangle \in \mathcal{L}$ satisfying the equation

$$i\frac{\partial}{\partial t}|\psi(t)\rangle = O(R)|\psi(t)\rangle \tag{25}$$

let us determine the value $\langle A\rangle$ of the physical quantity $A$ in
the state $|\psi(t)\rangle$ as the scalar product

$$\langle A\rangle = \langle\psi|O(A)|\psi\rangle \tag{26}$$

Relations (24)-(26) now represent a quite definite closed theory from the
point of view of both statistics and dynamics. The correspondence rule (24)
and the operators $O(A)$ of such a theory possess the linearity properties
(4c) and (4d). A mere substitution of operators (24) into equation (26)
reproduces the definition (3a) of the values of physical quantities, defines
the function $F(r, p, t)$ in accordance with relation (8), together with its
properties (3b)-(3c), and therefore fixes the only interpretation possible in
such a theory, i.e. $F(r, p, t)$ is the coordinate-momentum probability
density. Finally, knowing the coordinate-momentum probability density
$F(r, p, t)$ and the function representations $A(r, p, t)$ of physical
quantities in this theory, it is possible to calculate joint and conditional
probabilities for any physical quantities by the conventional methods of
probability theory. The theory so obtained is consequently of a consistent
probability character.
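
Once a non-negative joint density is available, marginal and conditional probabilities follow by ordinary bookkeeping. The sketch below assumes one degree of freedom, a correlated Gaussian chosen purely for illustration, and a discretization of phase space into cells; the function names and the parameter `rho` are hypothetical.

```python
import math

def joint_table(n=40, half_width=4.0, rho=0.6):
    """Discretize a correlated Gaussian density F(q, p) on an n-by-n grid of cells."""
    h = 2.0 * half_width / n
    centres = [-half_width + (i + 0.5) * h for i in range(n)]
    table = [[math.exp(-(q * q - 2.0 * rho * q * p + p * p) / (2.0 * (1.0 - rho * rho)))
              for p in centres] for q in centres]
    total = sum(map(sum, table))
    return centres, [[w / total for w in row] for row in table]

def marginal_q(table):
    """Probability of each coordinate cell, summed over all momenta."""
    return [sum(row) for row in table]

def marginal_p(table):
    """Probability of each momentum cell, summed over all coordinates."""
    return [sum(row[j] for row in table) for j in range(len(table[0]))]

def conditional_p_given_q(table, i):
    """Probability of each momentum cell given that the coordinate lies in cell i."""
    row_total = sum(table[i])
    return [w / row_total for w in table[i]]

centres, table = joint_table()
cell = 25                                        # coordinate cell centred near q = 1.1
conditional = conditional_p_given_q(table, cell)
conditional_mean_p = sum(p * w for p, w in zip(centres, conditional))
```

For this illustrative density the conditional mean of the momentum is close to rho times the coordinate of the chosen cell, as expected for a correlated Gaussian.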

It is quite natural, then, that the properties of the statistical theory thus
obtained, and its results in the first place, will depend on the concrete
solution of the problems (20)-(23). The natural questions arising from this
are: What is the difference between these theories and the generally accepted
quantum mechanics? Can this theory, with some concretization of the problems
(20)-(23), describe physical reality? And if so, in what way can this
concretization be carried out?
In the sections that follow, we shall try to discuss these questions, omitting 
for brevity's sake all mathematical calculation. 




The main advantage of the statistical theory whose mathematical apparatus
was given in the previous section is that it is of a consistent probability
character and at the same time does not violate the basic postulate of the
quantum theory. The concrete form of such a statistical theory, its properties
and its results depend on the a priori solutions of the problems (20)-(23).
However, irrespective of this concretization a number of general theoretical
consequences can be pointed out, amongst which are the following:

(1). No concretization of problems (20)-(23) makes this statistical theory 
coincide with the generally accepted quantum mechanics. This is quite 
obvious, since the generally accepted quantum mechanics would other- 
wise be reduced to the consistent probability scheme. 

(2). In the statistical theory under consideration Neumann's requirement is
in the general case violated, i.e. the operator of the square of a physical
quantity is not equal to the square of that quantity's operator, which can be
written as:

$$O(A^2) = O^2(A) + \mathcal{D}(A) \tag{27}$$

The linear operator $\mathcal{D}(A)$ defined by relation (27) depends on the
concrete form of the probability density operator $\hat{F}$ and, in the
general case, is not identically zero.
(3). In the theory proposed here, as in any statistical theory, the value
$\langle A\rangle$ of a physical quantity in the state $|\psi\rangle$ is
characterized by the uncertainty

$$\langle(\Delta A)^2\rangle = \langle A^2\rangle - \langle A\rangle^2 = \langle(A - \langle A\rangle)^2\rangle = \langle(O(A) - \langle A\rangle)^2\rangle + \langle\mathcal{D}(A)\rangle \ge 0 \tag{28}$$

whose non-negativeness is guaranteed here by the consistent probability
character of the theory. In contrast to the generally accepted quantum
mechanics, however, the value $\langle A\rangle$ of the quantity $A$ in a
state described by an eigenvector of $O(A)$ is, in the general case, not
strictly determined. Thus, if

$$O(A)|\psi_a\rangle = a|\psi_a\rangle \tag{29}$$

where $a$ is the eigenvalue, coinciding with $\langle A\rangle$, then from
(28) it follows that

$$\langle(\Delta A)^2\rangle_a = \langle\psi_a|\mathcal{D}(A)|\psi_a\rangle \ge 0 \tag{30}$$

Hence, over the eigenvectors of the operator $O(A)$ the operator
$\mathcal{D}(A)$ is non-negative and has the sense of a dispersion operator.
(4). If in a certain state $|\psi\rangle$ the dispersion of quantity $A$
reaches its minimum value in the sense that

$$\langle(\Delta A)^2\rangle_{\psi+\delta\psi} \ge \langle(\Delta A)^2\rangle_{\psi}$$

where $|\delta\psi\rangle$ is an arbitrary infinitesimal deviation of the
state vector, then $|\psi\rangle$ satisfies the non-linear equation

$$\{O(A^2) - 2a\,O(A) + a^2\}|\psi\rangle = d^2|\psi\rangle \tag{31}$$

where $a = \langle\psi|O(A)|\psi\rangle$.
(5). In the statistical theory under consideration the precision of
determining the value of even a single physical quantity is limited by the
inequality

$$\langle(\Delta A)^2\rangle \ge (\delta A)^2 = \min\{d_n^2\} \tag{32}$$

where $d_n^2$ are the eigenvalues of equation (31). The uncertainties
$\delta A$ are ultimately determined by the probability density operator and
may change with time.
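(6). For the uncertainties of two physical quantities $A$ and $B$ in any state
$|\psi\rangle$ there exists the relation

$$\{\langle(\Delta A)^2\rangle - \langle\mathcal{D}(A)\rangle\}\cdot\{\langle(\Delta B)^2\rangle - \langle\mathcal{D}(B)\rangle\} \ge \tfrac{1}{4}|\langle C\rangle|^2 \tag{33}$$

where $C = [O(A), O(B)]_-$. Inequality (33) represents the uncertainty
principle in the proposed statistical theory.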
(7). In the case when

$$[O(A^2), O(A)]_- = 0 \tag{34}$$

is valid for a quantity $A$, the eigenvectors of equations (29) and (31)
coincide. Therefore, provided that equality (34) is fulfilled, the states
with the most precise values of a quantity $A$ (minimum dispersion) are
defined, as in the generally accepted quantum mechanics, by the eigenvalue
equation of the operator $O(A)$. If the commutation condition (34) is
fulfilled both for quantity $A$ and for quantity $B$, the uncertainty relation
(33) takes the form:

$$\langle(\Delta A)^2\rangle\,\langle(\Delta B)^2\rangle \ge \tfrac{1}{4}|\langle C\rangle|^2 + (\delta A)^2(\delta B)^2 \tag{35}$$

(8). In a similar way to the generally accepted quantum mechanics, all
probability characteristics of a physical system in the theory investigated
are determined by the state $|\psi(t)\rangle$, the probability density of any
physical quantity $A$ in the state $|\psi(t)\rangle$ being given by the
expression

$$W(A, t) = \langle\psi(t)|\int\delta(A - A(r, p, t))\,\hat{F}(r, p, t)\,dr\,dp\,|\psi(t)\rangle \tag{36}$$

Here, however, the vector $|\psi\rangle$ does not generally have a distinct
physical sense and can be considered only as a mathematical image of the
probability density $F(r, p, t)$, carrying all its probability information.
(9). The condition for conserving the value $\langle A\rangle$ of a physical
quantity $A$ in time (the conservation law for quantity $A$) is formally the
same as in the generally accepted quantum mechanics:

$$\frac{d\langle A\rangle}{dt} = 0, \quad\text{if}\quad \frac{\partial O(A)}{\partial t} = i\,[O(A), O(R)]_- \tag{37}$$

However, the fulfillment of this condition with $R$ and $A$ fixed essentially
depends on the explicit form of the operator $\hat{F}(r, p, t)$.
(10). The proposed theory in the general case results in concepts that have no
analogue in the generally accepted quantum mechanics. They will be referred
to below as 'subquantum' concepts. Among them one could name the 'subquantum'
uncertainty $\delta A$ of a quantity $A$, which limits the precision of
determining the value $\langle A\rangle$ of this quantity.
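
Consequences (2), (3) and (6) can be checked in a finite-dimensional toy model. The sketch below is only an illustration under strong assumptions: phase space is replaced by four discrete points, the state space is two-dimensional, and four weighted projectors serve as hypothetical stand-ins for the probability density operator. The values assigned to the quantities A and B are arbitrary.

```python
import math

S = 1.0 / math.sqrt(2.0)
VECTORS = [[1.0, 0.0], [0.0, 1.0], [S, S], [S, -S]]
# Non-negative operators summing to the unit operator (toy density operator).
DENSITY_OPS = [[[0.5 * v[i] * v[j] for j in range(2)] for i in range(2)] for v in VECTORS]

def operator_for(values):
    """O(A) = sum_k A_k F_k."""
    return [[sum(a * f[i][j] for a, f in zip(values, DENSITY_OPS)) for j in range(2)]
            for i in range(2)]

def product(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def difference(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def expectation(op, psi):
    return sum(psi[i].conjugate() * op[i][j] * psi[j] for i in range(2) for j in range(2))

def dispersion_operator(values):
    """D(A) = O(A^2) - O(A)^2, the analogue of relation (27)."""
    o = operator_for(values)
    return difference(operator_for([a * a for a in values]), product(o, o))

def variance(values, psi):
    """Uncertainty computed directly from the outcome probabilities."""
    probs = [expectation(f, psi).real for f in DENSITY_OPS]
    mean = sum(a * p for a, p in zip(values, probs))
    return sum(a * a * p for a, p in zip(values, probs)) - mean ** 2

psi = [0.6 + 0.0j, 0.8 * S * (1 + 1j)]          # normalized state
a_values = [1.0, -1.0, 0.5, 0.5]
b_values = [0.2, 0.2, 1.0, -1.0]
op_a, op_b = operator_for(a_values), operator_for(b_values)

mean_dispersion_a = expectation(dispersion_operator(a_values), psi).real
reduced_a = variance(a_values, psi) - mean_dispersion_a
reduced_b = variance(b_values, psi) - expectation(dispersion_operator(b_values), psi).real
operator_variance_a = (expectation(product(op_a, op_a), psi) - expectation(op_a, psi) ** 2).real
commutator = difference(product(op_a, op_b), product(op_b, op_a))
bound = 0.25 * abs(expectation(commutator, psi)) ** 2
```

In this toy model the total uncertainty splits into the operator variance plus the mean of the dispersion operator, as in (28), and the product of the reduced uncertainties respects a bound of the form (33).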


According to the principle of constructing the statistical theories in
question, which was formulated in section 2, it is necessary above all to
solve problem (20), i.e. to define physical quantities as certain
coordinate-momentum-time functions. There obviously exist only two ways of
doing so, which divide the multitude of the theories under investigation into
two classes: (a) all $A(r, p, t)$ coincide with the classical ones and (b) all
$A(r, p, t)$, or at least some of them, differ from the classical ones.

The second method involves considerable difficulties (Tyapkin, 1968;
Zaparovany and Kuryshkin, 1975), since a constructive approach to the choice
of such functions, with the exception of Tyapkin's condition [the
$A(r, p, t)$ turn into the classical ones as $\hbar \to 0$], has not been
found as yet.

We shall assume, therefore, at least in this paper, that the functional
dependence of all physical quantities $A$ on $r$, $p$ and $t$ is given by the
functions $A(r, p, t)$ representing the same quantities in the classical
theory.


If, in solving problem (20), we made use of the analogy with the classical
theory, it would seem quite natural to use the analogy with the generally
accepted quantum mechanics when choosing the state space.

Restricting ourselves (in this paper) to the consideration of non-relativistic
theories alone, let us define $\mathcal{L}$ as the space of scalar
square-integrable functions of the coordinates, i.e.

$$|\psi(t)\rangle = \psi(r, t), \qquad \langle\psi(t)| = \psi^*(r, t), \qquad \langle\psi_1(t)|\psi_2(t)\rangle = \int\psi_1^*(r, t)\,\psi_2(r, t)\,dr \tag{38}$$

For the sake of convenience in further investigations let us represent each
operator $O(A)$ by a generation function $A_G(r, p, t)$, related to $O(A)$ by
the transformations:

$$A_G(r, p, t) = e^{-(i/a)rp}\,O(A)\,e^{(i/a)rp} \tag{39a}$$

$$O(A)U(r, t) = (2\pi a)^{-3}\int A_G(r, p, t)\,e^{(i/a)(r - r')p}\,U(r', t)\,dr'\,dp \tag{39b}$$

where $a$ is a constant. Equality (39a) defines the generation function of the
operator $O(A)$, while (39b) reconstructs the operator when the generation
function $A_G(r, p, t)$ is known.

Relations (39) also bring into correspondence to the probability density
operator $\hat{F}(\xi, \eta, t)$ some generation function
$f_G(\xi, \eta, r, p, t)$, where $\xi$ and $\eta$ are parameters of the
probability operator and its generation function. Then, from the
correspondence rule (24), follows a connection of the generation functions of
the operators $O(A)$ with the generation function of the probability density
operator $\hat{F}$:

$$A_G(r, p, t) = \int A(\xi, \eta, t)\,f_G(\xi, \eta, r, p, t)\,d\xi\,d\eta \tag{40}$$

It can be shown that, owing to the non-negativeness of the probability density
operator and its normalization (22a), the generation function $f_G$ can always
be written as

$$f_G(\xi, \eta, r, p, t) = \sum_K e^{-(i/a)rp}\,\mu_K(\xi, \eta, r, t)\int e^{(i/a)r'p}\,\mu_K^*(\xi, \eta, r', t)\,dr' \tag{41}$$

where $\mu_K(r, p, \xi, t)$ is a certain set of functions of the phase space
$(r, p)$, an additional configuration space $\xi$ and time $t$, satisfying the
normalization:

$$\sum_K\int\mu_K(r, p, \xi, t)\,\mu_K^*(r, p, \xi', t)\,dr\,dp = \delta(\xi - \xi') \tag{42}$$

Accepting the above concretizations of the functions $A(r, p, t)$ and the
state space $\mathcal{L}$, the whole of the statistical part of the theory
under investigation is defined by a set of functions $\mu_K(r, p, \xi, t)$
satisfying normalization (42). The set of functions $\mu_K$ determines the
generation function $f_G$ of the probability density operator [see expression
(41)] and, through (39b), the operator $\hat{F}$. In point of fact the
operator $\hat{F}$ itself can be dispensed with since, with the set of
functions $\mu_K$ fixed, the operators of all physical quantities are uniquely
determined by relations (39a), (40) and (41).

It should be noted that the functions $\mu_K$ themselves have no analogue
either in classical or in the generally accepted quantum mechanics, i.e. in
accordance with the terminology adopted above they are 'subquantum' notions.
The values of all 'subquantum' quantities appearing in the theory
['subquantum' uncertainties (32), for instance] are determined by the set of
functions $\mu_K$. It is therefore suggested that one should say that the set
of functions $\mu_K$ represents a certain 'subquantum situation' in the theory
under investigation. Thus, in the proposed statistical theory the same
physical system can be considered in various 'subquantum situations' (various
sets of $\mu_K$) and, vice versa, different physical systems can be considered
in one and the same 'subquantum situation' (a fixed set of $\mu_K$).

The choice of a 'subquantum situation' (a certain set of functions $\mu_k$) gives a
single-valued definition of all operators and consequently determines the
results of the theory. A change of the 'subquantum situation' leads to a change
of the whole set of results. Therefore, assuming the correctness of the statistical
theory investigated here, we are compelled to acknowledge that a 'subquantum
situation' reflects a certain physical reality, which has no analogue either in
classical or in the generally accepted quantum theory.

In the general case, a 'subquantum situation', as a physical reality, can change
both in time and in space. For instance, together with the unconditional
'subquantum' uncertainty of the coordinate $\delta r$, depending in the general case on $t$,
one can consider a conditional 'subquantum' uncertainty of the coordinate

$$\delta r(r_0, t) = \sqrt{\min_{\psi_{r_0}} \langle (\Delta r)^2 \rangle_{\psi_{r_0}}} \tag{43}$$

where $\psi_{r_0}$ are all possible states with $\langle r \rangle = r_0$. The coordinate's 'subquantum
uncertainty' (43) is also determined by the 'subquantum situation', but it may
depend not only on time, but on the system's location $r_0$ in space as well. This
means that there exists a possibility of spatial heterogeneity and anisotropy
of the 'subquantum situation'.

It should be noted, however, that the 'subquantum situation' in the
statistical theories under investigation is given by a set of functions $\mu_k$ only with
the above-accepted concretizations of the functions $A(r, p, t)$ and the states'
space $\mathcal{L}$. In the general case the 'subquantum situation' is given by the solution of
the whole set of problems (20)-(23). It is essential that the concept of a
'subquantum situation' is an indispensable part of the quantum mechanics-like
statistical theories possessing the joint coordinate-momentum probability density.


Since in the present paper our task is only to make a brief analysis of the
possibilities of the statistical theories obtained, we will henceforward make use
of the simplest 'subquantum situation', given by a set of functions

$$\mu_k(r, p, \xi, t) = (2\pi a)^{-3/2}\, \varphi_k(r - \xi, t)\, e^{(i/a)\, p \cdot \xi} \tag{44}$$

where $\varphi_k(r, t)$ is an arbitrary set of square-integrable functions, satisfying the
normalization

$$\sum_k \int |\varphi_k(r, t)|^2\, dr = 1 \tag{45}$$

A mere substitution of functions (44) into integral (42), with equality (45) taken
into account, shows that the set of functions (44) possesses the required
normalization.

Now the coordinate-momentum probability density operator is determined
up to an arbitrary set of coordinate-time functions $\varphi_k(r, t)$
normalized by condition (45).

The concretization (44) of the probability density operator is all the more
significant because here the commutator of the operators $O(r_i)$ and $O(p_j)$ does not
depend on the distinct form or the number of functions $\varphi_k$:

$$[O(r_i), O(p_j)]_- = i a\, \delta_{ij} \tag{46}$$

The commutator (46) follows from relations (39b), (40), (41), (44) and (45).

Besides, the 'subquantum situation' (44) can, in a particular case, be stationary,
spatially homogeneous and isotropic. For this it is enough to choose:

$$\varphi_k(r, t) = \varphi_k(|r|) \tag{47}$$

The set of 'subquantum' functions (47), owing to relations (39b), (40), (41)
and (44), results in a theory which is invariant with respect to time shifts,
translations and rotations of space.


In the previous sections the correspondence principle of the statistical theory
investigated here with the classical and generally accepted quantum mechanics
was used for the concretization of the functions $A(r, p, t)$ and the states' space $\mathcal{L}$.

Since the 'subquantum situation' has no analogue in the indicated theories,
the probability density operator F so far remains determined only up
to a normalized set of functions $\varphi_k(|r|)$ and the quantity $a$, present in relations
(39b), (41) and (44).

However, in spite of this uncertainty of the theory, the problem of the choice 
of a dynamic function R(r,p, t) (23), due to the correspondence principle, has 
been definitely solved. 

With the accepted concretizations of $A(r, p, t)$ and $\mathcal{L}$, for the coordinate and
momentum uncertainties in any state $\psi$ we have:

$$\langle (\Delta r_i)^2 \rangle \ge (\delta r)^2, \qquad \langle (\Delta p_j)^2 \rangle \ge (\delta p)^2 \tag{48a}$$

$$\langle (\Delta r_i)^2 \rangle \langle (\Delta p_j)^2 \rangle \ge \frac{a^2}{4}\, \delta_{ij} + (\delta r)^2 (\delta p)^2 \tag{48b}$$

where the 'subquantum' uncertainties $\delta r$ and $\delta p$ are functionals of the set of
functions $\varphi_k$ (Kuryshkin, 1972a, 1972b, 1973).

Relations (48) determine the conditions of the transition of the statistical
theory under investigation into the classical theory. Since the classical theory
allows F-distributions with the coordinate and the momentum precisely
determined, the conditions for such a transition will be:

$$\delta r \to 0, \quad \delta p \to 0, \quad a \to 0 \tag{49}$$
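As an illustration of inequalities (48), the following Python sketch checks them in a one-dimensional Gaussian toy model. Everything specific here is an assumption made for the example and is not taken from the text: a single 'subquantum' function with position spread `s`, a Gaussian state with spread `sigma`, the identification of the 'subquantum' uncertainties with the spreads of that function and of its Fourier transform, and the reading of (48b) as "product of variances is at least a²/4 + (δr)²(δp)²". The variance formulas are those of a smoothed (convolved) phase-space density in this toy model.

```python
import itertools

A_CONST = 1.0  # the constant a of the text, in arbitrary units (assumed value)

def total_variances(sigma, s, a=A_CONST):
    """Coordinate and momentum variances of the smoothed density (toy model)."""
    var_r = sigma**2 + s**2
    var_p = a**2 / (4.0 * sigma**2) + a**2 / (4.0 * s**2)
    return var_r, var_p

def subquantum_uncertainties(s, a=A_CONST):
    """Assumed (delta r)^2 and (delta p)^2: spreads of phi and of its Fourier transform."""
    return s**2, a**2 / (4.0 * s**2)

def check_48(sigma, s, a=A_CONST):
    """Return (does 48a hold, does 48b hold) for one pair of Gaussian widths."""
    var_r, var_p = total_variances(sigma, s, a)
    dr2, dp2 = subquantum_uncertainties(s, a)
    holds_a = var_r >= dr2 and var_p >= dp2
    holds_b = var_r * var_p >= a**2 / 4.0 + dr2 * dp2
    return holds_a, holds_b

# Scan a few widths of the state (sigma) and of the 'subquantum' function (s).
results = {(sigma, s): check_48(sigma, s)
           for sigma, s in itertools.product((0.1, 1.0, 10.0), repeat=2)}
for (sigma, s), (ok_a, ok_b) in results.items():
    print(f"sigma={sigma:5.1f}  s={s:5.1f}  (48a): {ok_a}  (48b): {ok_b}")
```

In this toy model the product of the two assumed 'subquantum' uncertainties is fixed at a/2, so they can vanish together only if a also tends to zero; the sketch therefore illustrates the inequalities but not the limit in which only the 'subquantum' uncertainties vanish.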


Differentiating the probability density $F(r, p, t)$ (8) with respect to time and
using the evolution equation (19), after passing to the limit we
come to the following conclusion (Kuryshkin, 1972b): the statistical theory
under consideration satisfies the correspondence principle when, and only
when,

$$a R(r, p, t) = H(r, p, t) \tag{50}$$

where $H(r, p, t)$ is the system's Hamiltonian.

Relations (48) also determine the conditions for the possible transition of
the statistical theory under investigation into the generally accepted quantum
mechanics. These conditions obviously are:

$$\delta r \to 0, \quad \delta p \to 0 \tag{51}$$

Comparing now commutator (46), relation (48b) and equation (25) under
conditions (50) and (51) with commutator (2), used in the generally accepted
quantum mechanics, Heisenberg's uncertainty principle and the Schrödinger
equation, we come to the conclusion:

$$a = \hbar \tag{52}$$

Thus, the principle of the correspondence of the statistical theory under
consideration with the generally accepted quantum mechanics requires that
the quantity $a$, which is present in relations (39b), (41), (44) and (50), coincide
with Planck's constant.


The concretization of the functions $A(r, p, t)$, the states' space $\mathcal{L}$, the probability
density operator $F(r, p, t)$ and the dynamic function $R(r, p, t)$ introduced in
the previous sections results in a particular case of the quantum mechanics-like
statistical theory with the consistent probability interpretation. The 'subquantum
situation' in this theory is given by a set of functions $\varphi_k(|r|)$, normalized by
the condition:

$$\sum_k \int |\varphi_k(|r|)|^2\, dr = 1 \tag{53}$$

The operators of physical quantities are defined by the correspondence rule

$$O(A)\, U(r, t) = (2\pi\hbar)^{-3} \int \varphi(r - \xi, p - \eta)\, A(\xi, \eta, t) \cdots \tag{54}$$

where $U(r, t)$ is an arbitrary coordinate-time function, $A(r, p, t)$ is a phase-space
and time function corresponding to quantity A in the classical theory, $\hbar$
is Planck's constant, and $\varphi(r, p)$ is an auxiliary function related to the 'subquantum'
functions $\varphi_k(|r|)$ by the relations:

$$\varphi(r, p) = (2\pi\hbar)^{-3/2}\, e^{-(i/\hbar)\, r \cdot p} \sum_k \varphi_k(|r|)\, \tilde{\varphi}_k^*(|p|) \tag{55}$$

$$\tilde{\varphi}_k(|p|) = (2\pi\hbar)^{-3/2} \int e^{-(i/\hbar)\, r \cdot p}\, \varphi_k(|r|)\, dr \tag{56}$$


The physical system's state in such a theory is described by the vector (wave
function) $\psi(r, t)$, normalized by the condition

$$\int |\psi(r, t)|^2\, dr = 1 \tag{57}$$

and satisfying an equation of the same type as the Schrödinger equation

$$i\hbar\, \frac{\partial \psi(r, t)}{\partial t} = O(H)\, \psi(r, t) \tag{58}$$

where $H(r, p, t)$ is the system's Hamiltonian function and $O(H)$ is the operator
corresponding to it in accordance with rule (54). The mean value $\langle A \rangle$ of the physical
quantity A in the $\psi$-state is determined by the formula:

$$\langle A \rangle = \int \psi^*(r, t)\, O(A)\, \psi(r, t)\, dr \tag{59}$$

The mathematical formalism of this particular case of the theory, given by
formulae (53)-(59), follows immediately from relations (24)-(26) with the
accepted concretization of the function $A(r, p, t)$ and equalities (38), (39a)-(41),
(44)-(45), (47), (50), (52) taken into account.

A mere substitution of operators (54) into formula (59), involving the
auxiliary functions (55), (56) and normalizations (53), (57), results in:

$$F(r, p, t) = (2\pi\hbar)^{-3} \sum_k \left| \int \varphi_k^*(|r - \xi|)\, e^{-(i/\hbar)\, \xi \cdot p}\, \psi(\xi, t)\, d\xi \right|^2, \qquad \int F(r, p, t)\, dr\, dp = 1 \tag{60}$$

Relations (60) determine the consistent probability interpretation of the
theory. The equation for $F(r, p, t)$ can be obtained from equation (58) with the
help of relations (54), (55) and (60b) (Kuryshkin, 1972c).

As has been noted above, the generally accepted interpretation of the wave
function $\psi$ cannot be accepted here since

$$W(r, t) = \int F(r, p, t)\, dp = \int |\psi(\xi, t)|^2 \sum_k |\varphi_k(|r - \xi|)|^2\, d\xi \tag{61}$$

i.e. the square of the modulus of $\psi(r, t)$ determines, but by no means coincides
with, the coordinate probability density $W(r, t)$.
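The structure of such a smoothed, non-negative phase-space density can be checked numerically. The sketch below is a one-dimensional analogue with a single 'subquantum' function; the Gaussian shapes of `psi` and `phi`, the grids, the choice of units with Planck's constant equal to one, and all function names are assumptions introduced only for illustration. It builds F on a grid as the squared modulus of a windowed Fourier integral, then compares its coordinate marginal with the convolution of the two squared moduli.

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (assumption made for the illustration)

def gaussian(x, sigma):
    """Real Gaussian amplitude whose squared modulus has unit integral and variance sigma**2."""
    return (2.0 * np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (4.0 * sigma**2))

def psi(x):
    return gaussian(x, 1.0)   # hypothetical wave function

def phi(x):
    return gaussian(x, 0.5)   # hypothetical single 'subquantum' function

# Integration grid (xi) and phase-space grid (x, p).
xi = np.linspace(-10.0, 10.0, 401)
x = np.linspace(-8.0, 8.0, 161)
p = np.linspace(-8.0, 8.0, 161)
dxi, dx, dp = xi[1] - xi[0], x[1] - x[0], p[1] - p[0]

# amplitude(x, p) = integral of phi*(x - xi) exp(-i xi p / hbar) psi(xi) d xi
window = np.conj(phi(x[:, None] - xi[None, :])) * psi(xi)[None, :]
kernel = np.exp(-1j * np.outer(xi, p) / HBAR)
amplitude = (window @ kernel) * dxi

# One-dimensional analogue of the phase-space density: non-negative by construction.
F = np.abs(amplitude) ** 2 / (2.0 * np.pi * HBAR)

total = F.sum() * dx * dp          # normalization of F
W = F.sum(axis=1) * dp             # coordinate marginal of F
W_conv = (np.abs(phi(x[:, None] - xi[None, :])) ** 2
          * np.abs(psi(xi)) ** 2).sum(axis=1) * dxi   # |psi|^2 smoothed by |phi|^2
var_x = (W * x**2).sum() * dx      # coordinate variance of W

print("normalization:", total)
print("max |W - W_conv|:", np.abs(W - W_conv).max())
print("variance of W:", var_x)
```

In this toy setting the coordinate marginal is broader than the squared modulus of the wave function by exactly the spread of the smoothing function, which is the one-dimensional counterpart of the statement that the squared modulus determines the coordinate density without coinciding with it.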


The statistical theory as represented by equations (53)-(59) does coincide
with the 'quantum mechanics with the non-negative QDF' (Kuryshkin, 1972c)
in the case of the stationary, homogeneous and isotropic 'subquantum situation'
(Kuryshkin, 1973). The general theoretical concepts of this theory are at
present fairly well studied (Kuryshkin, 1971, 1972a, 1972c, 1973;
Zaparovany, 1973; Zaparovany and coworkers, 1975). The solution of actual
problems within the framework of the mathematical formalism (53)-(59) yields
quite satisfactory results (Kuryshkin, 1972c, 1973; Kuryshkin and
Zaparovany, 1974; Zaparovany and coworkers, 1975).

Thus, equation (58), for example, results in the energy spectrum of a
one-dimensional harmonic oscillator

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right) + \varepsilon, \qquad n = 0, 1, \ldots \tag{62a}$$

where $\varepsilon > 0$ is the 'subquantum' energy, related to the 'subquantum' uncertainties
of coordinate $\delta r$ and momentum $\delta p$. The energy $\varepsilon$ does not affect the level differences
and is therefore not experimentally observable. The calculation of an oscillator's
average energy in a thermostat results in Planck's formula with the vacuum
energy increased by $\varepsilon$.
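The two statements about the shifted oscillator spectrum can be verified with a short numerical sketch. It assumes Boltzmann weights for the thermal average, and the numerical values of the level spacing, of the shift `eps` and of the temperature are arbitrary illustrative choices; the function names are hypothetical.

```python
import numpy as np

def levels(n, hbar_omega, eps):
    """Oscillator levels hbar*omega*(n + 1/2) + eps with a constant 'subquantum' shift eps."""
    return hbar_omega * (n + 0.5) + eps

def thermal_mean_energy(hbar_omega, eps, kT, n_max=2000):
    """Boltzmann average of the levels, truncated at n_max terms."""
    n = np.arange(n_max)
    E = levels(n, hbar_omega, eps)
    weights = np.exp(-(E - E[0]) / kT)   # subtract E[0] for numerical stability
    return float((E * weights).sum() / weights.sum())

def planck_with_shift(hbar_omega, eps, kT):
    """Planck's formula with the zero-point energy increased by eps."""
    return hbar_omega / np.expm1(hbar_omega / kT) + 0.5 * hbar_omega + eps

hbar_omega, eps, kT = 1.0, 0.3, 2.0      # illustrative values in arbitrary units
spacings = np.diff(levels(np.arange(6), hbar_omega, eps))
mean_numeric = thermal_mean_energy(hbar_omega, eps, kT)
mean_formula = planck_with_shift(hbar_omega, eps, kT)

print("level spacings:", spacings)
print("thermal mean (sum):", mean_numeric)
print("thermal mean (formula):", mean_formula)
```

The level spacings do not depend on `eps`, while the thermal mean differs from the usual Planck expression only by the additive constant `eps`.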

A similar problem for an electron in a hydrogen-like atom, in a second-order
approximation with respect to the coordinate 'subquantum' uncertainty $\delta r$,
gives the energy spectrum:

$$E_{nlm} = \cdots + T_\delta + \varepsilon_{nl} \tag{62b}$$

$n = 1, 2, \ldots$; $l = 0, 1, \ldots, n - 1$; $m = -l, \ldots, 0, \ldots, l$. Here $T_\delta$ is the 'subquantum'
kinetic energy related to $\delta p$, and $\varepsilon_{nl} > 0$ is the 'subquantum' energy
shift, connected with $\delta r$ and causing a splitting of the levels with respect to $l$ that resembles
the Lamb shift. The result (62b) agrees with the experimental data when

$$\delta r = 4.247 \times 10^{-12}\ \text{cm} \tag{63}$$

Energy levels (62), in contrast to the generally accepted quantum mechanics,
are not strictly determined. The dispersions of the levels, however, can be
calculated when some real functions $\varphi_k$ are chosen.


Even in the particular case of the theory analyzed in the previous section, which
permits the solution of some concrete problems, the concretization of F is
determined only up to a set of 'subquantum' functions $\varphi_k$. It is quite
clear that a further concretization of the operator F is out of the question in the
absence of some kind of assumption concerning the physical nature of the
'subquantum situation'.

The arbitrary choice of 'subquantum' functions $\varphi_k$, however, can be partially
eliminated by comparing the results of the theory with the experimental data.
Thus, for instance, condition (63), obtained as a result of such a comparison,
considerably reduces the arbitrariness in the choice of the 'subquantum
situation'. One more opportunity of reducing the arbitrariness in the choice of
functions can be pointed out. Experiment shows that the dispersion of
energy levels is either zero or at least very small. One may demand, therefore,
that the eigenfunctions of the operator $O(H)$ coincide with the eigenfunctions
of the minimum uncertainty equation (31) when A = H. This is possible only
with a commutation relation of the type (34), i.e.

$$[O(H^2), O(H)]_- = 0 \tag{64}$$


Obviously, at H fixed, equality (64) limits the choice of functions $\varphi_k$.
Assuming that condition (64) is fulfilled, one can estimate (qualitatively at
least) the energy level dispersions (62). Calculations show:

$$\langle (\Delta E)^2 \rangle_n = C + 2\varepsilon E_n, \qquad \langle (\Delta E)^2 \rangle_{nlm} = \Sigma + 4 T_\delta\, (E_{nlm} - T_\delta) \tag{65}$$

where the 'subquantum' quantities $C$, $\varepsilon$, $\Sigma$ and $T_\delta$ remain non-negative functionals
of the set $\varphi_k$, i.e. the additional concretizations (64) and (63) of the probability
density operator are not sufficient for a single-valued calculation of the dispersions.
However, equations (65) allow us to see the qualitative picture for the 
dispersion change relative to the energy-level increase 

<(A£) 2 >o=C+eo)ft, <(A£) 2 )„^co^oo 

2 2 

((AE) 2 )ooo = 2- 2T °^ e a=0, ((AE)\^ M ^2 

Thus, the minimum uncertainty for both the oscillator and the electron in a 
hydrogen-like atom is inherent in the ground-state (minimum energy) level. 
Hence, the additional concretization of F, established by equality (64) leads to 
quite satisfactory qualitative results of the theory. 


The investigations, the results of which are set forth in the present chapter,
make it possible to conclude as follows:

(1). There exist a multitude of theories satisfying the principal postulate of 
quantum mechanics and permitting a statistical interpretation on the 
basis of coordinate-momentum probability density. These theories
differ from each other in the concretization of physical quantities as
functions of phase-space and time, states' space, probability density 
operator and the dynamic function. The generally accepted quantum 
mechanics does not belong to their number. 

(2). Any specific theory out of the multitude of theories under consideration
is of a consistent probability character. The existence of the coordinate-momentum
probability density $F(r, p, t)$ in such a theory and the
functional relations $A(r, p, t)$ of the physical quantities A with coordinates
and momenta permit the calculation of all sorts of joint and
conditional probabilities by the conventional methods of probability
theory.
(3). Irrespective of the concretization, the theories in question bring into 
existence certain concepts ('subquantum' uncertainty, 'subquantum 
situation', etc.), which have no analogue either in the classical or in the 
generally accepted quantum mechanics. 

(4). There exists a theoretical concretization which leads to quite satisfactory
results. In one particular case of this concretization the statistical
theory in question turns into the classical and partly (in the realm of
physical quantities containing no products of the similar components of
coordinate and momentum) into the generally accepted quantum
mechanics.
(5). Violating Neumann's requirement for quantum operators, the theory in 
question is not subject to his theorem on the impossibility of 'hidden' 
parameters. Moreover, such statistical theory requires the introduction 
of some new physical concepts for the explanation of the physical nature 
of the 'subquantum situation'. 

(6). In the statistical theories under consideration the concept of the uncer- 
tainty of physical quantities acquires a more general character than in 
accepted quantum mechanics. The correlation of the coordinate and 
momentum uncertainties (48b) which is nothing but Heisenberg's 
uncertainty principle reinforced by the 'subquantum' uncertainties is 
true in the particular concretization of the theory. 

(7). The utilization of the joint coordinate-momentum probability density 
in the theory under investigation is equivalent to the assumption that a 
physical system always possesses quite definite coordinate and momentum.
An uncertainty principle of the Heisenberg type,
therefore, is not in contradiction with the concept of the coordinate and
the momentum existing simultaneously as physical realities.

(8). The existence of the uncertainty principle in the statistical theory under 
investigation and the fact that the coordinate and momentum can be 
considered as simultaneously existing physical realities signify that this 
theory does not pretend to be a complete description of physical reality. 
In other words, the proposed statistical theory assumes the existence of 
a more profound, more deterministic theory, capable of also explaining, 
among other things, the physical nature of the 'subquantum situation'. 

The author wishes to express his most sincere gratitude to Professor L. de
Broglie, Professor Ya. P. Terletsky, Professor J. Lochak and the participants of
the seminars at the Peoples' Friendship University (Moscow) and the Henri
Poincaré Institute (Paris) for numerous and helpful discussions of the problem
investigated in this paper.


Andrade e Silva, J. L. and Lochak, G. (1969) Quanta, Grains et Champs, L'Univers de connaissances, Hachette, Paris.
Arinshtein, E. A. and Guitman, D. M. (1967) Izvest. Vusov U.S.S.R., Fiz., No. 5, 123.
Bess, L. (1973) Progr. Theoret. Phys., 49, 1889.
Blokhintsev, D. I. (1940) J. Phys., 2, 71.
Bohm, D. (1952) Phys. Rev., 85, 166.
Bohm, D. and Vigier, J. P. (1954) Phys. Rev., 96, 208.
Born, M. and Jordan, P. (1925) Z. Physik, 34, 858.
Bopp, F. (1956) Ann. Inst. Henri Poincare, XV, 81.
Cohen, L. (1966a) J. Math. Phys., 7, 781.
Cohen, L. (1966b) The Philosophy of Science, 33, 317.
Cohen, L. (1970) J. Math. Phys., 11, 3296.
De-Broglie, L. (1927) J. Phys. Radium, 8, 225.
De-Broglie, L. (1953) La Physique Quantique Restera-t-elle Indeterministe?, Gauthier-Villars, Paris.
De-Broglie, L. (1956) Une Interpretation Causale et Non Lineaire de la Mecanique Ondulatoire: la Theorie de la Double Solution, Gauthier-Villars, Paris.
De-Broglie, L. (1964) Thermodynamique de la Particule Isolee, Gauthier-Villars, Paris.
De-Broglie, L. and Andrade e Silva, J. L. (1971) La Reinterpretation de la Mecanique Ondulatoire,
Dirac, P. A. M. (1958) The Principles of Quantum Mechanics, Oxford University Press, Oxford.
Einstein, A., Podolsky, B. and Rosen, N. (1935) Phys. Rev., 47, 777.
Einstein, A. (1948) Dialectica, 11, 320.
Fenyes, I. (1952) Z. Physik, 132, 81.
Finkelstein, R. J., Fronsdal, C. and Kaus, P. (1956) Phys. Rev., 103, 1571.
Glasko, V. B., Lerust, F., Terletsky, Ya. P. and Shushurin, S. F. (1958) Zh. Eksperim. Teor. Fiz., U.S.S.R., 35, 452.
Gorshenkov, V. N. and Kognkov, V. L. (1973) Izvest. Vusov U.S.S.R., Fiz., No. 7, 140.
Gorshenkov, V. N., Denisova, N. A., Kognkov, V. L. and Ryasanova, L. Z. (1973) Teor. Mat. Fiz., U.S.S.R., 15, 288.
Groenewold, H. J. (1935) Physica, 12, 405.
Imre, K., Ozizmir, E., Rosenbaum, M. and Zweifel, P. F. (1967) J. Math. Phys., 8, 1097.
Kerner, E. H. and Sutsliffe, W. G. (1970) J. Math. Phys., 11, 391.
Klimantovich, Yu. L. and Silin, V. P. (1960) Uspe. Fiz. Nauk, U.S.S.R., 70, 247.
Kuryshkin, V. V. (1968) Sb. Nauchn. Robot Aspirantov, Peoples' Friendship University, Moscow, No. 1, 243.
Kuryshkin, V. V. (1969a) Izvest. Vusov U.S.S.R., Fiz., No. 4, 111.
Kuryshkin, V. V. (1969b) Sb. Nauchn. Robot Aspirantov, Peoples' Friendship University, Moscow, No. 6, 198.
Kuryshkin, V. V. (1971) Izvest. Vusov U.S.S.R., Fiz., No. 11, 103.
Kuryshkin, V. V. (1972a) Compt. Rend., 274, Serie B, 1107.
Kuryshkin, V. V. (1972b) Ann. Inst. Henri Poincare, XVII, 81.
Kuryshkin, V. V. (1972c) Compt. Rend., 274, Serie B, 1163.
Kuryshkin, V. V. (1973) Int. J. Theoret. Phys., 7, 451.
Kuryshkin, V. V. (1974) Teor. Fiz., Peoples' Friendship University, Moscow, 78.

Kuryshkin, V. V. and Zaparovany, Yu. I. (1974) Compt. Rend., 279, Serie B, 17.
Margenau, H. and Hill, R. N. (1961) Progr. Theoret. Phys., 26, 722.
Mehta, C. L. (1964) J. Math. Phys., 5, 677.
Moyal, J. E. (1949) Proc. Cambridge Phil. Soc., 45, 99.
Neumann, J. (1932) Mathematische Grundlagen der Quantenmechanik, Springer, Berlin.
Pena-Auerbach, L., Cetto, A. M. and Brody, T. A. (1972) Lettere alla Redazione, Nuovo Cimento, 5, 177.
Rivier, D. C. (1951) Phys. Rev., 83, 862.
Ruggery, G. J. (1971) Progr. Theoret. Phys., 46, 1703.
Rybakov, Yu. P. (1974) Foundations of Physics, 4, 149.
Schrodinger, E. (1935) Naturwissenschaften, 23, 807, 823, 844.
Schrodinger, E. (1955) Nuovo Cimento, 1, 5.
Shankara, T. S. (1967) Progr. Theoret. Phys., 37, 1335.
Shewell, J. R. (1959) Am. J. Phys., 27, 16.
Temple, G. (1935a) Nature, 135, 957.
Temple, G. (1935b) Nature, 136, 179.
Terletsky, Ya. P. (1937) Zh. Eksperim. Teor. Fiz., U.S.S.R., 7, 1290.
Terletsky, Ya. P. (1960) J. Phys. Radium, 21, 771.
Tolman, R. S. (1938) The Principles of Statistical Mechanics, Clarendon Press, New York.
Tyapkin, A. A. (1968) Development of Statistical Interpretation of Quantum Mechanics by Means of the Joint Coordinate-Momentum Representation, U.S.S.R., Dubna.
Weyl, H. (1950) The Theory of Groups and Quantum Mechanics, Clarendon Press, New York.
Wigner, E. P. (1932) Phys. Rev., 40, 749.
Yvon, J. (1948) Cahiers Phys., 33, 25.
Zaparovany, Yu. I. (1974) Izvest. Vusov U.S.S.R., Fiz., No. 6, 18.
Zaparovany, Yu. I. and Kuryshkin, V. V. (1975) The article is deposited in VINITI U.S.S.R., No. 2353-75, Dep.
Zaparovany, Yu. I., Kuryshkin, V. V. and Lyabis, I. A. (1975) Sovremen. Zadachi v tochnikh naukakh, Peoples' Friendship University, Moscow, No. 1, 89 and 94.
Zlatev, I. S. (1974) Compt. Rend., 27, 311.


Measurement Theory 

The Problem of Measurement in 
Quantum Mechanics 

Ludovico Lanz 

Istituto di Fisica dell'Università, Milan, Italy


Quantum mechanics is itself a statistical theory of measurements. It works very 
well for some measurements, for example wonderfully well in atomic spectros- 
copy. It may appear surprising that a theory of measurement should be basic to 
quantum mechanics since quantum mechanics is explained in most textbooks, 
as well as being applied and further developed, without particular reference to 
such a theory of measurement. The peculiarity of quantum mechanics is that 
one firstly has measurements and then subsequently one must worry about 
what has been measured. 
There are two very different attitudes towards quantum mechanics; namely: 

(1). Quantum mechanics is the fundamental theory of physics; any physical 
theory is essentially a theory of measurements. 

(2). Quantum mechanics is the fundamental theory of microsystems; there- 
fore the theory of microsystems is essentially a theory of measurements on 
microsystems by macrosystems. A primary objective of physics is therefore to 
describe the nature of macrosystems. To reach such an objective the physics of 
microsystems is an essential ingredient. 

The first attitude is the point of view one learns in textbooks of quantum 
mechanics, e.g. Dirac's fundamental book. There is an interpretation of 
physics, first expressed by J. von Neumann (1955) and favoured by Wigner 
(1971), in which the observer has a fundamental role: physics describes the 
observations of the observer and his impressions are the basic entities. Since 
observations are made by measuring apparatuses the following consistency 
problem arises. Any observable of a system must be equivalent to another 
observable of a second suitable system interacting with the first one, if the latter 
is to be interpreted as a measuring apparatus. Von Neumann gives a schematic 
solution of this problem of measurement. However the impressions of an 
observer are real but absolutely private entities. An observer cannot point out 
the impressions he has received from an observation, therefore such impressions
are outside the realm of any science. On the contrary one needs objects as
basic entities of physics. One can identify objective aspects in quantum
mechanics: there are systems and sets of measurements on them which are 
dispersionless, i.e. the outcome of these measurements is certain. Then corres- 
ponding to each measurement of this kind one can attribute an objective 
property to the system. The set of such properties is the 'state' of a single 
system. Quantum mechanics has the following general feature. At any time a 
statistical collection of systems can be decomposed into subcollections such 
that to each system in each subcollection an objective 'state' can be attributed. 
One can identify in such properties the basic entities which are measured. 
Consistently with this point of view Jauch (1968, 1969) and Piron (1964) were 
able to obtain quantum mechanics as a consequence of simple axioms about 
yes-no experiments, properties and states. The 'state' of a macroscopic system 
should embody all the typical macroscopic properties known, for example, in 
the case of a measuring apparatus, a certain position of the pointer is a 
macroscopic property. 

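Let A be an apparatus with a pointer. The pointer moves from $A_0$ to $A_1$ when
A interacts with a system which has a property $p$; the pointer does not move
when the system has the property $p^*$ ($p^*$: not to have the property $p$). Let
$h_A$ ($h_S$) be the Hilbert space of the apparatus (of the system); $h_p \subset h_S$ the
subspace of $h_S$ associated with the property $p$, $h_{A_0} \subset h_A$ the subspace associated
with the position $A_0$ of the pointer; $H$ is the Hamiltonian of the joint system, $t$
the duration of measurement. One must have

$$e^{-iHt}\, P^A_{\varphi_{A_0}} \otimes P^S_{\varphi}\, e^{iHt} = P^A_{\varphi_{A_1}} \otimes P^S_{\varphi'}, \qquad e^{-iHt}\, P^A_{\varphi_{A_0}} \otimes P^S_{\psi}\, e^{iHt} = P^A_{\varphi'_{A_0}} \otimes P^S_{\psi'} \tag{1}$$

for all $\varphi \in h_p$, $\psi \in h_S \ominus h_p$, $P_\psi$ being the projection on $\psi$; $\varphi_{A_0}, \varphi'_{A_0} \in h_{A_0}$.

Consider now a case in which system S has neither property $p$ nor $p^*$, but
has another property $p'$ such that $h_{p'}$ is contained neither in $h_p$ nor in $h_S \ominus h_p$. Let
$\eta \in h_{p'}$ be the state of S:

$$\eta = c_1 \varphi + c_2 \psi; \qquad \varphi \in h_p,\ \psi \in h_S \ominus h_p; \qquad \|\varphi\| = \|\psi\| = 1$$

One has by (1)

$$e^{-iHt}\, P^A_{\varphi_{A_0}} \otimes P^S_{\eta}\, e^{iHt} = |c_1|^2\, P^A_{\varphi_{A_1}} \otimes P^S_{\varphi'} + |c_2|^2\, P^A_{\varphi'_{A_0}} \otimes P^S_{\psi'} + \mathrm{Re}\, c_1 c_2^* (\ldots) \tag{2}$$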

This formula contains the whole problem of measurement in the first-mentioned
attitude; if the last term were zero, no problem would arise. The
collection of systems A+S could then be separated into two subcollections, the first
containing a fraction $|c_1|^2$ of systems A+S each of which has part A with the
pointer at $A_1$, the second subcollection containing a fraction $|c_2|^2$ of systems
A+S with part A having the pointer at $A_0$. This is in complete agreement with
the physical meaning which in quantum mechanics is given to the coefficients $c_1$, $c_2$
in $\eta = c_1 \varphi + c_2 \psi$. However the third term in equation (2) is non-zero if $c_1$ and
$c_2 \neq 0$. It is the infamous interference term. The mathematical reason for its
presence is the following: $P^A_{\varphi_{A_0}} \otimes P^S_{\eta}$ is a pure state, i.e. an extreme element of
the convex set of states. If time evolution is represented by linear, invertible
mappings on the set of states, an extreme element is mapped into an extreme
element; if the interference term is zero, the right-hand side of equation (2) is a
mixture of two states and not an extreme state, which is impossible. Due to the
interference term the pointer of A after the interaction no longer has a position.
The notion of objectivity in quantum mechanics is too restrictive to give an
account of the objective properties of part A of the system A+S, if A and S
interact.
More generally if one considers a composite system $S_1 + S_2$, due to the
interaction, there are no properties of $S_1 + S_2$ to be described as: $S_1$ has a
property $p_1$ and $S_2$ has a property $p_2$. As long as $S_1$ and $S_2$ are microsystems this
can look strange but is not a basic difficulty: it is essentially the Einstein-
Rosen-Podolsky paradox; if one of the components is macroscopic one meets a
big difficulty, as has been particularly stressed by d'Espagnat (1971b). The
difficulty lies in the objectivity criterion or in the time evolution law. It is 
difficult to see how to generalize the objectivity criterion; time evolution is a 
consequence of symmetry under time translations of an isolated system. Since 
the interference terms depend critically on small external perturbations, it
may be that they are not really meaningful in the physically realizable conditions
of isolation, as remarked by Zeh (1971). If one considers a system as a
subsystem of a larger one, one has no strict condition about its time evolution. 
'Non-Hamiltonian' mappings, which are largely used in the theory of open 
systems, are admissible. Such mappings are not invertible and extreme states 
can be mapped into mixtures. The idea that any observed system should be 
considered as an open system has led Everett (1957) to claim that no system
can be isolated from the rest of the universe, in which the observer must also be 
included. However there are many examples of successful phenomenological 
theories for isolated macrosystems. The fact that in the quantum-mechanical 
description of macrosystems elements enter which are highly unstable, or 
foreign to the system, such as Everett's wave function of the whole universe, is an
indication that quantum mechanics does not work very well for macrosystems. 
Therefore, in my opinion, due to the difficulty in the problem of measurement, 
attitude (1) should be dismissed. 

The second-mentioned attitude, at least at a linguistic level, was that held by 
Bohr. He puts the objective character of apparatuses in the foreground and,
perhaps misleadingly, describes them as 'classical systems'. This was often
interpreted in the sense that there are systems, to be used as measuring 
apparatuses, for which pre-quantum physics is the right theory; since on the 
contrary quantum effects are well known in macrophysics, the concept of 
'classical systems' seems fictitious. Ludwig (1972) has recently formulated a 
new general theory of macrosystems in which their objective character is built 
in. He has also developed a theory for a composite system (macrosystem +
microsystem) (Ludwig, 1966b). The problem of measurement is formally
solved; it is shown that the registration of the condition of a macroscopic apparatus after
the interaction with a microsystem is equivalent to the measurement of an
observable of the microsystem. In principle such an observable can be calculated
in terms of mathematical elements referring to the description of the
macroscopic apparatus. Not only a physical content of a statistical operator for
a microsystem can be read off from measuring apparatus A, but also the initial 
statistical operator can be read off from the macro-source S of the microsystem. 
In conclusion quantum mechanics for a microsystem a produced by the source 
S and measured by the apparatus A, describes to a certain extent the interac- 
tion between S and A, the microsystem being the vehicle of such an interaction; 
in principle the macroscopic description of system S + A can completely replace 
the quantum mechanics of a. A final primacy of the objectivistic way of 
describing the world or classical way, is established, consistently with quantum 
mechanics of microsystems. Such a result does not mean that one has gained a 
classical insight into microsystems, but rather one has a classical 'outsight' from 
the microsystems; no classical hidden variables have been found, but a classical 
anchorage for the microsystems has been achieved. It may be that the search 
for such an anchorage, which is a basic goal of a sane philosophy, was one of the
motivations for looking for hidden variables. 

An important consequence of this point of view is that one has no need for 
objectivity criteria inside quantum mechanics of a microsystem. Properties and 
states for single microsystems are interesting but not basic features of the 
formalism. Peculiarities concerning properties of composed microsystems, as 
in the ERP paradox are no longer a difficulty. Quantum mechanics should not 
be based on properties and states of microsystems, but should be a theory for a 
certain class of experiments which, by means of an interaction between a source 
S and a measuring apparatus A, give evidence for a microsystem; this is just the 
starting point of Ludwig's axiomatic approach to quantum mechanics (Ludwig 
1970). At such a point most physicists would be disappointed since this theory 
apparently gives a secondary role to quantum mechanics. However let me 
stress that the theory of macrosystems proposed by Ludwig is nothing more 
than a formalization of statistics and 'state' space. Every type of time evolution,
deterministic or not, Markoffian or not, can be placed into it. Obviously all
known examples of theories for macrosystems fit into this scheme, which, 
however, does not help one to find such theories. The sole general and 
conclusive way to build a theory of a macrosystem is to rely on its atomistic 
structure and describe it by the mechanics of the microcomponents. So indeed 
microsystems entered into physics first as an hypothesis, then as objects that 
could be emitted and revealed by macrosystems. The basic role of quantum 
mechanics is to provide the concept of a particle and an insight into the 
interactions between the particles. Particles and their interactions are then the 
starting point for the atomistic theory of macrosystems, which should finally fill 
up the formal scheme proposed by Ludwig. The actual theory of atomic 
structure is quantum statistical mechanics, in brief N-body quantum theory. 
Therefore we can state the problem of measurement in the following form. 
N-body quantum theory must yield the concrete input for Ludwig's theory of 
macrosystems. Quantum theory of an N-body system interacting with a 
microsystem must yield the concrete input for the new theory of a macrosystem 
interacting with a microsystem. The important point is the following: we need
not rely on an objectivity criterion in quantum mechanics and pretend that
macroscopicity is a 'property' (in the technical sense) of the N-body structure, 
which leads to the difficulty with the interference term in equation (2). One 
must simply extract from the N-body theory in a sufficiently general and precise 
way what is relevant as an input for the new theory of macrosystems and throw 
away the rest as physically irrelevant. This has not yet been done in a 
satisfactory way. 

Anyway the problem of measurement in quantum mechanics is not a 
philosophical quarrel about the interpretation of the world, nor a basic 
difficulty of quantum mechanics as in equation (2), but a technical problem in 
N-body theory. This important conclusion was reached long before the
recent theory of a macrosystem, by Ludwig (1953) and by Daneri, Loinger and 
Prosperi (1962), who identified in the ergodic behaviour of many-particle 
systems the technically relevant point. Since these early attempts other main 
approaches and ideas in quantum statistics such as the master equation theory 
(Lanz and coworkers, 1971), the independent subdynamics theory (George 
and coworkers, 1972), C* algebra formalism (Hepp, 1972), have been con- 
fronted with the problem of measurement; all these attempts suffer from the 
lack of a clear and general mathematical characterization of macroscopicity. A
general and very readable survey of the way from microphysics to macrophysics is
given by Caldirola (1974). The somewhat utilitarian exploitation of N-body 
theory by which the problem of measurement should be solved, is justified if 
one takes into account that N-body theory is a formal extrapolation of the 
quantum mechanics of microsystems to the case of systems with an extremely 
large number of particles, in which the quite unobservable correlations 
between all the particles are described. Suppose we have on the one side
the quantum mechanics for microsystems, and on the other Ludwig's scheme for
macrosystems filled in with the aid of N-body quantum theory; one can then
hope that a new unified theory can be revealed, ranging from microsystems to
macrosystems, which could perhaps cover also intermediate systems such as 
macromolecules (Ludwig, 1972a,b). 

I shall discuss the problem of measurement adopting the second attitude
mentioned above. In Section 2 quantum mechanics as a theory of measurements
on microsystems will be discussed, with particular reference to the EPR
paradox. In Section 3 a sketchy, but I hope not too distorted, account of
Ludwig's theory of macrosystems will be given. Finally, in Section 4, the
problem of measurement within N-body theory will be stated in a precise way.


There are interactions between two macrosystems which can be explained in 
terms of a microsystem which the first macrosystem emits and by which the 
second one is affected. We shall call the first macrosystem the source or
preparation part. The typical feature of such an interaction is that it causes a
perturbation which spreads out from one or more pointlike regions inside the 
affected part. In a space-time description of the affected part these perturba- 
tions involve space-time points inside one or more cones with axes parallel to 
the future time axis. The occurrence of such perturbations is very stochastic, 
i.e. repetition of the same experiment under the same conditions of preparation 
and of affected parts yields a different pattern of the afore-mentioned point- 
like regions. The single experiments are not reproducible and it would be 
meaningless to formulate a theory for them. If a single experiment is repeated n 
times, the frequencies of occurrence of a certain type of perturbation can be 
measured; if n is large enough such frequencies are reproducible and it is worth 
while formulating a theory for them. In fact such frequencies have in a certain 
sense a universal character. They are completely independent of very many
features of the preparation and the affected part. Such a situation is an obvious
consequence of the atomistic structure of matter: microsystems affect only
some atoms of the affected part by interactions that have a universal character.
We shall mean by an experiment on a microsystem a statistical collection of a 
large number of single experiments in each of which a single microsystem is 
emitted and revealed. It consists in principle in the repetition of the following
steps: (a) production of the microsystem by a source which has been prepared
by taking into account a certain set $\alpha$ of macroscopic prescriptions, the same
procedure to be taken in the n repetitions, referred to a frame R; (b)
observation of whether a measuring apparatus, which has been prepared taking
a certain set $\beta$ of macroscopic prescriptions into account, is affected or not in a
certain prescribed way $\gamma$ ($\beta$ and $\gamma$ are also referred to R and are fixed in the n
repetitions); (c) counting how many times $n_+$ the apparatus is affected. The
frequency $n_+/n$ is the quantitative result of the experiment. Let us call $(R, \alpha)$ a
preparation procedure and $(\beta, \gamma)$ a measuring procedure.
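
The role of the frequency $n_+/n$ can be illustrated numerically. The following Python sketch is only a toy model under stated assumptions: each single experiment is taken to be an independent random trial in which the apparatus is affected with a fixed probability `p_effect`; the function name `run_experiment` and the numerical values are hypothetical and are not part of the theory described here.

```python
import random


def run_experiment(n, p_effect, seed=0):
    """Repeat a single experiment n times and return the frequency n_plus / n.

    Assumption (toy model): in each repetition the apparatus is affected
    independently with probability p_effect.
    """
    rng = random.Random(seed)
    # Step (c): count how many times the apparatus is affected.
    n_plus = sum(1 for _ in range(n) if rng.random() < p_effect)
    return n_plus / n


# A small n gives an irreproducible frequency; a large n gives a stable one.
freq_small = run_experiment(20, 0.3, seed=1)
freq_large = run_experiment(200_000, 0.3, seed=1)
print(freq_small, freq_large)
```

With a small number of repetitions the frequency fluctuates strongly from run to run, while for large n it settles near the assumed value, which is the sense in which only the frequencies are reproducible.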

Ludwig has obtained the axiomatic structure of quantum mechanics, in a
generalized form, starting from suitable axioms about a physical input
consisting of preparation procedures, measuring procedures and frequencies
(Ludwig, 1970). Let me summarize the result in the particular case of no
superselection rules. The set M of experiments which give evidence for a
microsystem is described by means of a Hilbert space $h_M$ with the following
interpretation. Each statistical operator W on $h_M$ represents a class of
equivalent preparation procedures, where two preparation procedures are
equivalent if one has the same frequencies for any measuring procedure. Each
operator F on $h_M$, such that $0 \le F \le I$, represents a class of equivalent
measuring procedures, where two measuring procedures are equivalent if for any
preparation procedure one has the same frequency.

The operators F are called 'effects' by Ludwig; in the conventional axiomatics
of quantum mechanics measuring procedures are associated only with
projections on $h_M$. The frequency of the effect F, when the preparation W is
made, is given by $\mathrm{Tr}(FW)$. The symmetry of the theory under time-translations
implies that a semigroup $\mathcal{V}(t)$, $t > 0$, exists of mappings of the set $L_M$ of effects
into $L_M$, such that $\mathcal{V}(t)I = I$, I being the identity operator. $\mathcal{V}(t)$ is contractive
on $B(h_M)$, the Banach space of linear bounded operators on $h_M$ (Comi and
coworkers, 1975). In the standard axiomatics of quantum mechanics one
requires that $\mathcal{V}(t)$ can be extended to a group; then projection operators are
mapped into projection operators and $\mathcal{V}(t)$ must have the following structure:

$$\mathcal{V}(t)F = V(t)\,F\,V^{\dagger}(t)$$

V(t) being a unitary group on $h_M$. Let iH be the generator of V(t); then H is the
hamiltonian of the system.

Let $\mathcal{V}'(t)$ be the adjoint operator of $\mathcal{V}(t)$; $\mathcal{V}'(t)$ maps the set $K(h_M)$ of all
statistical operators on $h_M$ into itself. $\mathcal{V}'(t)$ is a contractive operator on $TC(h_M)$,
where $TC(h_M)$ is the Banach space of 'trace class' operators on $h_M$. Due to the
definition of $\mathcal{V}'(t)$ one has:

$$\mathrm{Tr}\big((\mathcal{V}(t)F)\,W\big) = \mathrm{Tr}\big(F\,\mathcal{V}'(t)W\big)$$

Let us consider a preparation procedure $W = (\alpha, R)$ and the transformed
preparation $\mathcal{V}'(t)W$; $\mathcal{V}'(t)W$ is the preparation procedure W shifted back in
time by a time interval of length t. The relation between W and $\mathcal{V}'(t)W$ can be
described as follows: $\mathcal{V}'(t)W$ consists of the preparation W and of waiting a
time t after W is finished. In the time interval [0, t] only the microsystem
evolves, therefore $\mathcal{V}'(t)$ can be considered as the time evolution operator of the
microsystem.
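
The duality between the evolution of effects and the evolution of statistical operators can be checked numerically in the special case of a unitary group. The sketch below is illustrative only: it assumes a three-dimensional Hilbert space, a randomly generated hamiltonian, and the convention that effects evolve as $V F V^{\dagger}$ with $V = e^{iHt}$, so that statistical operators evolve as $V^{\dagger} W V$; the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # dimension of a toy Hilbert space (assumption)


def random_hermitian(dim):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (a + a.conj().T) / 2


def random_positive(dim):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return a @ a.conj().T


# Unitary group element V = exp(iHt), built from the spectral decomposition of H.
H = random_hermitian(d)
evals, evecs = np.linalg.eigh(H)
t = 0.7
V = evecs @ np.diag(np.exp(1j * evals * t)) @ evecs.conj().T

# Statistical operator W: positive with unit trace.
W = random_positive(d)
W = W / np.trace(W).real

# Effect F: positive with largest eigenvalue 1, hence 0 <= F <= I.
F = random_positive(d)
F = F / np.linalg.eigvalsh(F).max()

# Evolving the effect or evolving the preparation gives the same frequency.
heisenberg = np.trace(V @ F @ V.conj().T @ W)
schroedinger = np.trace(F @ (V.conj().T @ W @ V))
print(heisenberg.real, schroedinger.real)
```

Both numbers coincide and lie between 0 and 1, as required of a frequency.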

Space-time symmetry has further consequences: in the non-relativistic case
(for simplicity we consider only this case) one has on $h_M$ a unitary, projective
representation of the group G of space translations, accelerations and
rotations. Let us assume that such a representation is irreducible and that the
theory describes at least the effects linked to the simplest perturbations of
the affected parts; such simplest perturbations are those spreading out from
one point-like region. The representations of G which satisfy these
requirements are characterized by two indices, a half-integer number s and a
positive number m, to be interpreted as spin and mass of an elementary particle.
By suitable choices of the parameters s and m certain sets M of experiments can
be described: the corresponding microsystems are the simplest ones, they are
'one particle' systems.

Let us consider two experiments concerning two particles I and II and let us 
build a correlated experiment, in which the two preparation procedures and the 
two measuring procedures are performed together. 

Such correlated experiments can obviously be described by preparations

$$W^{I} \otimes W^{II} \in K^{I} \otimes K^{II} \subset TC(h^{I}) \otimes TC(h^{II})$$

and by effects

$$F^{I} \otimes F^{II} \in L^{I} \otimes L^{II} \subset B(h^{I}) \otimes B(h^{II})$$

where

$$TC(h^{I}) \otimes TC(h^{II}) = TC(h^{I} \otimes h^{II})$$

$$B(h^{I}) \otimes B(h^{II}) = B(h^{I} \otimes h^{II})$$

$W^{I} \otimes W^{II}$ and $F^{I} \otimes F^{II}$ are very particular statistical operators and effects
on the Hilbert space $h^{I} \otimes h^{II}$. The axiomatic structure of quantum mechanics
leads us to assume the existence of a microsystem to be associated with the
Hilbert space $h^{I} \otimes h^{II}$, i.e. each $W \in K(h^{I} \otimes h^{II})$, $F \in L(h^{I} \otimes h^{II})$ should in
principle correspond to a preparation and a measuring procedure. On $h^{I} \otimes h^{II}$
one can place a unitary representation of $G \times G$ and at least effects linked
with two-point perturbations of the affected part can be described.

Then symmetry allows the following structure for V(t): $V(t) = e^{iHt}$,
$H = H^{I} \otimes I^{II} + I^{I} \otimes H^{II} + H_{int}$, $H_{int}$ describing the interaction between the two
particles. Microsystems of this type with a statistical operator such that
interaction plays a role, so that $H_{int}$ can be tested, are prepared in all scattering
experiments and are emitted from macrosystems in many spontaneous decay
processes. In conclusion if I and II are particles, a microsystem (I, II) also exists
with Hilbert space $h^{I} \otimes h^{II}$ and one can explain experiments about (I, II) by a
suitable choice of $H_{int}$, at least in the non-relativistic case. Let us investigate the
structure of microsystem (I, II) and its relation with particles I and II. Consider
effects of the form $F^{I} \otimes I^{II}$ and $I^{I} \otimes F^{II}$; one has

$$\mathrm{Tr}(F^{I} \otimes I^{II}\, W) = \mathrm{Tr}_{h^{I}}(F^{I} W^{I}), \qquad \mathrm{Tr}(I^{I} \otimes F^{II}\, W) = \mathrm{Tr}_{h^{II}}(F^{II} W^{II})$$

where $W^{I} = \mathrm{Tr}_{h^{II}}(W) \in K^{I}(h^{I})$ and $W^{II} = \mathrm{Tr}_{h^{I}}(W) \in K^{II}(h^{II})$.

For physically meaningful $H_{int}$ one has that for t large enough, $t > \bar{t}$, $\mathcal{V}'(t)W$ can
be replaced with $\mathcal{V}'^{I}(t) \otimes \mathcal{V}'^{II}(t)\,\mathcal{S}W$, $\mathcal{S}$ being a 'collision' mapping of K onto K.
Then for $t > \bar{t}$ one has

$$\mathrm{Tr}_{h^{I} \otimes h^{II}}\big(F^{I} \otimes I^{II}\, \mathcal{V}'(t)W\big) = \mathrm{Tr}_{h^{I}}\big(F^{I}\, \mathcal{V}'^{I}(t) W^{I}\big), \qquad W^{I} = \mathrm{Tr}_{h^{II}}(\mathcal{S}W)$$

and similarly for I $\leftrightarrow$ II. The interpretation of these results is straightforward: we
have two particles I and II, which for $t > \bar{t}$ no longer interact and are described
by the initial statistical operators $W^{I}$ and $W^{II}$.
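
The reduction of a statistical operator of the pair to statistical operators of the single particles is a partial trace, and the identity for effects of the form $F \otimes I$ can be verified numerically. The sketch below is illustrative only: the dimensions 2 and 3, the random operators and the helper `partial_trace` are assumptions made for the example.

```python
import numpy as np


def partial_trace(W, d1, d2, keep):
    """Reduce a (d1*d2) x (d1*d2) operator to subsystem `keep` (1 or 2)."""
    W4 = W.reshape(d1, d2, d1, d2)  # indices: (i, j, k, l) <-> (i*d2 + j, k*d2 + l)
    if keep == 1:
        return np.einsum('ajbj->ab', W4)  # trace over the second factor
    return np.einsum('iaib->ab', W4)      # trace over the first factor


rng = np.random.default_rng(1)
d1, d2 = 2, 3

# Random statistical operator on the tensor product space.
a = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
W = a @ a.conj().T
W = W / np.trace(W).real

# Random effect on the first factor, 0 <= F1 <= I.
b = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))
F1 = b @ b.conj().T
F1 = F1 / np.linalg.eigvalsh(F1).max()

W1 = partial_trace(W, d1, d2, keep=1)
W2 = partial_trace(W, d1, d2, keep=2)

lhs = np.trace(np.kron(F1, np.eye(d2)) @ W)  # frequency computed on the pair
rhs = np.trace(F1 @ W1)                      # frequency computed on particle I alone
print(lhs.real, rhs.real)
```

The two frequencies agree, which is the content of the statement that effects on one particle only see the reduced statistical operator.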

We can look for the joint occurrence of the effects $F^{I} \otimes I^{II}$ and $I^{I} \otimes F^{II}$, to be
represented by $F^{I} \otimes F^{II}$, which has a frequency $\mathrm{Tr}(F^{I} \otimes F^{II}\, W)$. In conclusion:
the microsystem (I, II) is a pair of particles I, II and a correlation law for joint
measurements. The problem of the description of a microsystem (I, II) has been
raised by the well-known EPR paradox (Einstein and coworkers, 1935). To
make the discussion more specific let us consider the following example given
by Bohm (1951): two spin-1/2 particles emitted in a singlet state by a suitable
source move in opposite directions and the components of the spins $S_1$ and $S_2$
in two directions $n_1$, $n_2$ are measured, e.g. by means of two Stern-Gerlach
magnets followed by two detectors. If $u_{\pm}^{n}$ are the normalized eigenvectors of
$\mathbf{S} \cdot \mathbf{n}$ corresponding to eigenvalues $\pm\frac{1}{2}$, one has, neglecting for simplicity the
spatial coordinates:

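$$W = P_{\psi}, \qquad \psi = \frac{1}{\sqrt{2}}\left(u_{+}^{n} \otimes u_{-}^{n} - u_{-}^{n} \otimes u_{+}^{n}\right) \in C^{2} \otimes C^{2}, \qquad W^{I} = \tfrac{1}{2} I^{I}, \qquad W^{II} = \tfrac{1}{2} I^{II}$$

Let $P_n$ be the eigenprojection of $S_n = \mathbf{S} \cdot \mathbf{n}$ corresponding to the eigenvalue $\frac{1}{2}$; for
the effects

$$P_{n}^{I} \otimes I^{II}, \quad I^{I} \otimes P_{n}^{II}, \quad P_{n}^{I} \otimes P_{n}^{II}, \quad P_{n}^{I} \otimes P_{-n}^{II}, \quad P_{z}^{I} \otimes P_{x}^{II}, \quad P_{z}^{I} \otimes P_{y}^{II}$$

one has the probabilities

$$\mathrm{Tr}(P_{n}^{I} \otimes I^{II}\, P_{\psi}) = \mathrm{Tr}(I^{I} \otimes P_{n}^{II}\, P_{\psi}) = \tfrac{1}{2}$$

$$\mathrm{Tr}(P_{n}^{I} \otimes P_{n}^{II}\, P_{\psi}) = 0, \qquad \mathrm{Tr}(P_{n}^{I} \otimes P_{-n}^{II}\, P_{\psi}) = \tfrac{1}{2}$$

$$\mathrm{Tr}(P_{z}^{I} \otimes P_{x}^{II}\, P_{\psi}) = \tfrac{1}{4}, \qquad \mathrm{Tr}(P_{z}^{I} \otimes P_{y}^{II}\, P_{\psi}) = \tfrac{1}{4}$$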
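
The singlet-state probabilities can be reproduced with a few lines of linear algebra. The following Python sketch is illustrative: it represents the spin components by Pauli matrices, takes the eigenprojection for eigenvalue +1/2 along a unit vector n as $(I + \mathbf{n} \cdot \boldsymbol{\sigma})/2$, and uses helper names (`proj`, `joint`) that are introduced here only for the example.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def proj(n):
    """Eigenprojection of S.n for the eigenvalue +1/2 (n a unit 3-vector)."""
    n = np.asarray(n, dtype=float)
    return (I2 + n[0] * SX + n[1] * SY + n[2] * SZ) / 2


# Singlet state and its projector P_psi.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
psi = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
P_psi = np.outer(psi, psi.conj())


def prob(effect):
    """Frequency Tr(effect * P_psi) of an effect on the pair."""
    return np.trace(effect @ P_psi).real


def joint(n, n_prime):
    """Probability of the joint effect P_n (particle I) and P_n' (particle II)."""
    return prob(np.kron(proj(n), proj(n_prime)))


x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)
minus_z = (0, 0, -1)

marginal_I = prob(np.kron(proj(z), I2))
marginal_II = prob(np.kron(I2, proj(z)))
print(marginal_I, marginal_II, joint(z, z), joint(z, minus_z), joint(z, x), joint(z, y))
```

In general the joint probability equals $(1 - \mathbf{n} \cdot \mathbf{n}')/4$, which contains the special cases listed above.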

The microsystem consists of two particles with isotropic spin statistics; it has the
following property: total spin = 0; no spin property can be attributed to
particles I and II. The simple correlation law can be anticipated from rotational
symmetry. Take a collection of microsystems (I, II) with preparation $P_{\psi}$, put an
apparatus for measuring $S_n$ in the path of one of the two particles. Theoretically
the frequency of $S_n = \frac{1}{2}$ ($-\frac{1}{2}$) is $\frac{1}{2}$. Then put an apparatus for measuring $S_{n'}$ on
the path of the second particle and again measure the frequency of $S_n = \frac{1}{2}$ ($-\frac{1}{2}$).
The two sets of apparatus can coexist since

$$[P_{n}^{I} \otimes I^{II},\ I^{I} \otimes P_{n'}^{II}] = 0$$

We have the same effect as before

$$P_{n}^{I} \otimes I^{II} = P_{n}^{I} \otimes P_{n'}^{II} + P_{n}^{I} \otimes P_{-n'}^{II}$$

and the same result as before. Apparatus II does not influence in any way the
physics of particle I. One has the identity

$$\mathrm{Tr}(P_{n}^{I} \otimes I^{II}\, P_{\psi}) = \mathrm{Tr}(P_{n}^{I} \otimes I^{II}\, W_{m,n'})$$

$$W_{m,n'} = \tfrac{1}{2} W_{n'} + \tfrac{1}{2} W_{-n'}; \qquad W_{n'} = 2\,(I^{I} \otimes P_{n'}^{II})\, P_{\psi}\, (I^{I} \otimes P_{n'}^{II}), \quad \forall n'$$

We see that as far as particle I is concerned, we can describe the collection as
built from two equal subcollections of microsystems with the property $S_{n'}^{II} = \frac{1}{2}$
for the first one and the property $S_{-n'}^{II} = \frac{1}{2}$ for the second one; this for any
direction n', obviously without any consequence for particle I. Let us use
measurement II to build a new statistical collection: we select those
microsystems for which the apparatus II has yielded the result $S_{n'}^{II} = \frac{1}{2}$ ($-\frac{1}{2}$) and
look for the effects produced by the microsystems of this collection. The
statistical operator for this new collection is $W_{n'}$ ($W_{-n'}$) and a single microsystem
of it has the property $S_{n'}^{II} = \frac{1}{2}$ ($S_{-n'}^{II} = \frac{1}{2}$). If $S_{n}^{I}$ is measured the probability of the
result $S_{n}^{I} = \frac{1}{2}$ depends on $\mathbf{n} \cdot \mathbf{n}'$, e.g. in the case n = z, n' = x (y) it is $\frac{1}{2}$, in
agreement with the probability 1/4 of $S_{z}^{I}$, $S_{x}^{II}$ ($S_{y}^{II}$) for the initial collection. The
transition $W \to W_{n'}$ is not a consequence of the interaction of particle II with the
apparatus II, but is a consequence of the repreparation procedure in which
measurement II is used. We stress that such a repreparation cannot coexist with
another apparatus that measures $S_{n''}^{II}$, $n'' \ne n'$; therefore the two decompositions
of W according to the measurements of $S_{n'}^{II}$, $S_{n''}^{II}$ cannot be made together. There is
nothing peculiar or paradoxical in this description. However it is possible that
one would like to consider the microsystem (I, II) as a system of two correlated
particles I and II, in which case difficulties would arise. The correlation should
be a correlation between properties of the two particles. Unfortunately, due to
the interaction, $W^{I}$, $W^{II}$ are not pure states even if W is a pure state; this means
that if a property can be attributed to a microsystem (I, II), such a property is
not expressible as a property of particles I and II. Thus one has nothing to
correlate and no vector state can be associated with the components I and II of
microsystem (I, II); d'Espagnat describes this peculiarity as 'non-separability'
of (I, II) into I and II (d'Espagnat, 1971a). In my opinion this indicates that one
should not ascribe a basic role to the concepts of property and of state for a
single microsystem, as I have already stressed in Section 1 where attitudes (1)
and (2) were discussed. If attitude (1) is chosen one can assume that not the
whole set $K(h^{I} \otimes h^{II})$ is physically meaningful, but only statistical operators
with the following structure:

$$W = \sum_j \lambda_j\, P_j^{I} \otimes P_j^{II}, \qquad \lambda_j \ge 0, \quad \sum_j \lambda_j = 1$$

where $P_j^{I} \otimes P_j^{II}$ are projections on states of the form

$$u_j^{I} \otimes u_j^{II}$$

Statistical operators of this kind are called by d'Espagnat mixtures of the first
kind, while the other mixtures are called of the second kind. This can be looked
upon as a mixture of pure states $u_j^{I} \otimes u_j^{II}$ in each of which a property j of
particle I is correlated to a property j of particle II. Therefore in such a case
microsystem (I, II) could be considered as a system of two correlated particles.
However any state $P_{\psi}$ with

$$\psi = \alpha\, u_1^{I} \otimes v_1^{II} + \beta\, u_2^{I} \otimes v_2^{II}$$

must be excluded; if I and II are two identical fermions, usual quantum
mechanics claims that all pure states have this structure. It seems difficult to
eliminate such states, which provide the energy levels for atoms in excellent
agreement with experimental results. However one could expect that scattering
states, with well separated particles, should be described as two correlated
particles, e.g. in the example we considered before, one could expect that the
singlet state transforms into a mixture of the kind (Jauch, 1971)

$$W = \frac{1}{4\pi} \int P_{n}^{I} \otimes P_{-n}^{II}\, dn$$


A behaviour of this type has been discussed by Bohm and Aharonov (1957).
Then $\mathcal{V}'(t)$ must transform pure states into mixtures. Usually one assumes that
$\mathcal{V}'(t)$ is a group; this excludes the afore-mentioned behaviour. However, in the
framework of Ludwig's axiomatics, as shown by Comi and coworkers
(1975), the assumption that $\mathcal{V}'(t)$ is a group seems to be unnecessarily
restrictive. $\mathcal{V}'(t)$ can be a semigroup of linear mappings of K into K. Then the
required behaviour can arise at least asymptotically (Barchielli and Lanz, 

A very important point is that measurable correlations are different for
separable and for non-separable microsystems (I, II). Such differences are the
same as those which discriminate between the existence or non-existence of
local hidden variables. In fact a violation of Bell's inequality would prove the
existence of non-separable microsystems (Selleri, 1971; Kasday, 1971). Let
me comment briefly on the relation of quantum mechanics to hidden variables
theory. We started the discussion of experiments about microsystems by
observing that for practical reasons only a statistical theory is needed, since
single experiments are not reproducible. However one cannot exclude that a more
fundamental theory exists which could be applied to very hypothetical, perhaps
non-realizable, preparations, which are so accurate that all effects are
reproducible, i.e. for each effect the theory tells us whether it occurs or not.

Since the basic concept in macrophysics is the concept of state space, as will
be shown in Section 3, it is appealing to associate with a microsystem a 'hidden'
state space $Z_M$, which we assume to be a measure space, with a suitable set K of
measures on a suitable $\sigma$-algebra of subsets of $Z_M$. Each preparation part
prepares a microsystem to be represented by an element of $Z_M$. To a measuring
part one associates a measurable function $\eta(z)$ which assumes the value
unity if the measuring perturbation occurs or zero if it does not occur. In a
statistical experiment one can assume that a measure $\mu_W \in K$ corresponds to a
preparation procedure and an 'average' function $\bar{\eta}(z)$ corresponds to a
measuring procedure, with $0 \le \bar{\eta}(z) \le 1$ and $\bar{\eta}(z)$ measurable; then the
probability of an effect F after a preparation W would be

$$\int_{Z_M} \eta_F(z)\, d\mu_W(z)$$

The very hypothesis of the existence of a 'hidden' state space $Z_M$ for a
microsystem does not contradict the basic axioms of quantum mechanics, if the
latter is intended as the statistical theory of a certain class of interactions
between two macrosystems; this is not so obvious if quantum mechanics is the
theory of properties of a microsystem.

A famous negative theorem about the existence of 'hidden' variables has 
been given by von Neumann (1955) and in a more sophisticated way by Jauch 
and Piron (1963). The physical relevance of these negative theorems has been 
criticized in an important paper by Bell (1966). While existence or not of 
hidden variables has little to do with quantum mechanics, the properties of the 
function $\eta_F(z)$, which refers to a measuring procedure and not to a measuring
part, must be confronted with quantum mechanics. Bell considers the class of
local hidden variables. Let me translate Bell's definition of local hidden
variables into the language of the present discussion. Consider the effect
$F = (F_1 F_2) = (F_1 \text{ and } F_2)$ where $F_1$, $F_2$ refer to spatially well separated effects;
then the locality condition is $\eta_{(F_1 F_2)}(z) = \eta_{F_1}(z)\,\eta_{F_2}(z)$. It is just such a
requirement which makes hidden variables useful to describe a microsystem (I, II)
and to explain the EPR paradox. Let us consider effects such that the pairs
$F_1, F_2$; $F_1, F_2'$; $F_1', F_2$; $F_1', F_2'$ are coexistent and satisfy the locality condition.
Typically $F_1, F_1'$ ($F_2, F_2'$) can correspond to two different orientations of the same
apparatus, e.g. the symmetry axis of a photon linear polarization analyser can
be oriented in different directions. Then for any preparation, the probabilities
$P(F_1, F_2)$, $P(F_1, F_2')$, $P(F_1', F_2)$, $P(F_1', F_2')$ satisfy the inequality (Bell, 1971)

$$|P(F_1, F_2) - P(F_1, F_2')| + P(F_1', F_2) + P(F_1', F_2') \le 2$$

In the mathematical theory of microsystems (I, II) provided by quantum 
mechanics there are effects and preparations (having the feature of 'non- 
separability') which do not satisfy Bell's inequality. Therefore local hidden 
variables do not complete quantum mechanics, but contradict certain of its 
statistical predictions. It is a very important experimental problem to gain any 
evidence of a violation of Bell's inequality; this would rule out local hidden 
variables and indicate the reality of 'non-separable' microsystems. Recently 
interesting results on the two-photon system have been obtained by Kasday 
(1971) and by Clauser and Freedman (1972), which indicate a violation of 
Bell's inequality. 
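
The conflict between local hidden variables and quantum mechanics can be made concrete with a small numerical check. The sketch below is an illustration under explicit assumptions, not a description of the experiments cited: it uses the spin singlet rather than the two-photon system, works with correlation functions of outcomes ±1, and uses the symmetric form of the Bell expression, $|E(a,b) - E(a,b')| + |E(a',b) + E(a',b')|$; the angles and helper names are chosen for the example.

```python
import itertools
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def spin_along(theta):
    """Observable with outcomes +1/-1 along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * SZ + np.sin(theta) * SX


up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
psi = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)  # singlet state


def correlation(theta_1, theta_2):
    """Quantum correlation of the two outcomes in the singlet state."""
    op = np.kron(spin_along(theta_1), spin_along(theta_2))
    return (psi.conj() @ op @ psi).real


def bell_expression(e11, e12, e21, e22):
    return abs(e11 - e12) + abs(e21 + e22)


# Analyser orientations (assumed for the example).
a, a_p, b, b_p = 0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4
quantum_value = bell_expression(
    correlation(a, b), correlation(a, b_p),
    correlation(a_p, b), correlation(a_p, b_p),
)

# Local deterministic models: each analyser setting has a predetermined outcome.
local_max = max(
    bell_expression(A1 * B1, A1 * B2, A2 * B1, A2 * B2)
    for A1, A2, B1, B2 in itertools.product((-1, 1), repeat=4)
)
print(local_max, quantum_value)
```

Every local deterministic assignment gives the value 2, and mixtures of them cannot exceed it, whereas the singlet state reaches $2\sqrt{2}$ for these orientations.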


A macrosystem is such that at any time one can say 'how it is'; physics is
supposed to give a mathematical description of how 'a macrosystem' is. In place
of the phrase 'how it is' let us speak of the 'state' of the macrosystem at time t
and represent such a state by an element z(t) of a suitable space Z. More
precisely, let us consider a macrosystem which for times t > 0 is isolated;
Ludwig postulates that its objective qualities at any time t > 0 are represented
by a point z(t) in a state space Z. Such a description of a macrosystem can be
called realistic, objectivistic or, somewhat misleadingly, 'classical'. 'Classical'
refers to the fact that it was the sole attitude of physicists before the
development of quantum mechanics, but does not mean at all that one pictures a
macrosystem as an assembly of molecules described by classical mechanics or
that one derives all electromagnetic phenomena from the Maxwell equations.
To make clearer what is meant, consider a black body at equilibrium: its state
can be specified by a description of its walls, the temperature T and the
distribution $u(\nu)$ of electromagnetic energy density over the eigenfrequencies of
the electromagnetic field; neglecting all aspects of the walls except the volume
V of the hollow part, one has $z = (V, T, \{u(\nu)\})$, $z \in Z$; the average value of the
variable $u(\nu)$ is the well-known Planck radiation law, in which the 'quantum
theoretical' constant h appears. All statements about a macrosystem finally
refer to a suitable space Z, which depends on the kind of system and on the level
of the description. Examples of this are as follows: macrosystems schematized
by a set of mass points, having at any time a position $x_j(t) \in R^{3}$ and a momentum
$p_j(t) \in R^{3}$, $j = 1, 2 \dots k$, which can be represented by an element of $R^{6k}$; a fluid
in local equilibrium inside a region $\Omega \subset R^{3}$ is described in hydrodynamics by a
mass density function $\rho(x) \in L^{1}(\Omega)$, by an internal energy density $u(x) \in L^{1}(\Omega)$
and by a velocity field $v(x) \in L^{\infty}(\Omega)$; then the fluid can be represented by an
element of $L^{1}(\Omega) \times L^{1}(\Omega) \times L^{\infty}(\Omega)$. A dilute gas in $\Omega$ is almost completely
described by the Boltzmann distribution function $f(x, p) \in L^{1}(\Omega \times R^{3})$.

Fortunately enough, in many cases fluctuations of the state are very small so 
that statistics can be forgotten, the difficult problem of defining measures in a 
function space can be avoided and actual states identified with average states. 

By many examples, e.g. the afore-mentioned ones, one is led to assume that
Z is a complete metric space. The space

$$Y = C(\Theta_+, Z), \qquad \Theta_+ = (0, +\infty),$$

of all continuous functions z(t), t > 0, $z(t) \in Z$, is called trajectory space. On Y
one can define in a well-known way a topology, by which Y becomes a complete
metric space. However the corresponding metric d(y, y') does not have a direct
physical meaning. If two trajectories y, y' are physically judged to be close,
this is not well represented by a condition such as $d(y, y') < \varepsilon$,
the latter criterion being too restrictive. Ludwig shows that a new metric can be
defined, leading to physically meaningful vicinities, which induces in Y the
same topology (but a coarser uniform structure) as the topology $C_c(\Theta_+, Z)$ of
uniform convergence on compact subsets of $\Theta_+$; Y with such a new metric is
not complete. Its completion $\hat{Y}$ is a compact Hausdorff space; in $\hat{Y}$ the set of
all continuous functions z(t), t > 0, is dense. On $\hat{Y}$ a continuous time translation
operator $T(\tau)$ can be defined for $\tau > 0$: $T(\tau)y = y'$ where $y = z(t) \Rightarrow y' = z(t + \tau)$,
$\forall y \in Y$.

Let us consider the Borel $\sigma$-algebra $\mathcal{B}(\hat{Y})$ and the set of all signed Radon
measures on $\mathcal{B}(\hat{Y})$. To this set a Banach space structure can be given; it
coincides with $C'(\hat{Y})$, which is the dual space of the Banach space $C(\hat{Y})$ of
continuous functions on $\hat{Y}$.

A preparation procedure of a statistical collection of macrosystems is
represented by a positive, normalized Radon measure $u(\omega)$ on $\mathcal{B}(\hat{Y})$. Once
such a measure is explicitly given the whole statistical dynamics of a
macrosystem of the prepared collection is known. In fact $u(\omega)$ is the probability
that the trajectory of the macrosystem belongs to $\omega \subset \hat{Y}$. The whole physics of a
macrosystem of a given type (i.e. describable in a given space Z) is known if the
convex set of all possible preparations $K_m \subset C'(\hat{Y})$ is known. Suppose $K_m$ is
given; let us see how the physics of the macrosystem can be obtained. Consider
the weak closure $\bar{K}_m$ of $K_m$ in $C'(\hat{Y})$; $\bar{K}_m$ is convex and compact in the
$\sigma(C'(\hat{Y}), C(\hat{Y}))$ topology; the extreme points $u_i$ of $\bar{K}_m$ are 'elementary
preparations', each preparation being a mixture of them. Then consider
$u_i \in \bar{K}_m$. The support of $u_i(\omega)$ is the set of functions z(t), and also of limit
points of Y in $\hat{Y}$, which are possible trajectories for the macrosystem; if the
support of $u_i$ reduces to a point $y_i \in Y$ one has a deterministic dynamics, $y_i$
being the trajectory of the macrosystem. If one assumes that the cylindrical sets

$$\omega_{\eta,t} = \{y : z(t) \in \eta,\ \eta \in \mathcal{B}(Z),\ t > 0\}$$

are $u_i$ measurable, $u_i(\omega_{\eta,t})$ is the probability that the state of a macrosystem of
the prepared collection belongs at time t to $\eta \subset Z$. Obviously in the
deterministic case, with $z_i(t)$ the trajectory $y_i$,

$$u_i(\omega_{\eta,t}) = \begin{cases} 1 & \text{if } z_i(t) \in \eta \\ 0 & \text{otherwise} \end{cases}$$

For any $t > 0$, $u \in K_m$, $u(\omega_{\eta,t})$ is a positive, normalized measure on $\mathcal{B}(Z)$. If for
any $u_1, u_2 \in K_m$ the equality

$$u_1(\omega_{\eta,t}) = u_2(\omega_{\eta,t}), \qquad \forall \eta \in \mathcal{B}(Z),\ t < \varepsilon,\ \varepsilon \text{ arbitrary} > 0$$

implies $u_1 = u_2$, the theory is Markoffian. More general
cylindrical sets can be considered

$$\omega_{\eta_1, t_1; \dots; \eta_k, t_k} = \{y : z(t_i) \in \eta_i,\ i = 1, 2 \dots k,\ \eta_i \in \mathcal{B}(Z);\ t_i > 0\}$$

then $u(\omega_{\eta_1, t_1; \dots; \eta_k, t_k})$ provides a full description of the time correlations.
Symmetry under time translations implies that if $u \in K_m$ also $u_t \in K_m$,
$u_t(\omega) = u(T(t)^{-1}\omega)$, $t > 0$; the mapping $u \to u_t$ defines a semigroup $\mathcal{V}'(t)$ of
endomorphisms of $C'(\hat{Y})$, $u_t = \mathcal{V}'(t)u$, which maps $K_m$ into $K_m$. The great advantage of this
formulation is as follows: no assumption about the dynamics of a macrosystem 
enters into the mathematical structure of the theory which, however, is precise 
enough to solve formally the problem of measurement in quantum mechanics. 
The unusual concept of trajectory space can be avoided at the price of the 
following, perhaps wrong, assumption about the dynamics of a macrosystem: 
by a suitable choice of Z the dynamics of a macrosystem is Markoffian. 
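
What a Markoffian description amounts to can be seen in a deliberately simple finite model. The following Python sketch rests entirely on assumptions made for illustration: the state space has three points, time is discrete, the one-step evolution is a column-stochastic matrix `T`, and the helpers `evolve`, `chi` and `cylinder_probability` are hypothetical names; none of this is taken from Ludwig's theory itself.

```python
import numpy as np

# Hypothetical 3-point state space Z = {0, 1, 2}; columns of T sum to one,
# so T maps probability measures on Z into probability measures on Z.
T = np.array([[0.9, 0.2, 0.0],
              [0.1, 0.7, 0.3],
              [0.0, 0.1, 0.7]])


def evolve(u, steps):
    """Semigroup of time translations acting on a measure u (discrete time)."""
    return np.linalg.matrix_power(T, steps) @ u


def chi(subset):
    """Restriction of a measure to a subset of Z: (chi u)(eta) = u(subset ∩ eta)."""
    mask = np.zeros(3)
    mask[list(subset)] = 1.0
    return np.diag(mask)


def cylinder_probability(u, constraints):
    """Probability that the trajectory lies in subset_i at time t_i for all i.

    constraints: list of (time, subset) pairs with increasing times.
    """
    t_prev = 0
    for t, subset in constraints:
        u = chi(subset) @ evolve(u, t - t_prev)
        t_prev = t
    return u.sum()


u0 = np.array([1.0, 0.0, 0.0])  # preparation concentrated on state 0
print(evolve(u0, 5))
print(cylinder_probability(u0, [(1, {0}), (2, {1})]))
```

In such a model the measure on the state space at one time determines all time correlations, which is what distinguishes the Markoffian case from the general trajectory-space description.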

Classical mechanics in phase space, hydrodynamics and the Boltzmann
description of a gas are examples of Markoffian theories. In this case the
preparations of a collection of macrosystems are represented by a set $K_1$ of
positive normalized measures on $\mathcal{B}(\hat{Z})$, $\hat{Z}$ being a suitable compactification of Z,
and a semigroup $\mathcal{V}'_1(t)$ of endomorphisms of $C'(\hat{Z})$ exists which maps $K_1$ into
$K_1$. If $u \in K_1$ is a statistical preparation of a macrosystem, $(\mathcal{V}'_1(t)u)(\eta)$,
$\eta \in \mathcal{B}(\hat{Z})$, is the probability that the state z of the macrosystem belongs at
time t to a set $\eta \subset Z$. For $\sigma \in \mathcal{B}(\hat{Z})$ let us define the linear operator $\chi_\sigma$ on
$C'(\hat{Z})$ as

$$(\chi_\sigma u)(\eta) = u(\sigma \cap \eta)$$

Then

$$\big(\chi_{\eta_k} \mathcal{V}'_1(t_k - t_{k-1}) \cdots \chi_{\eta_2} \mathcal{V}'_1(t_2 - t_1)\, \chi_{\eta_1} \mathcal{V}'_1(t_1)\, u\big)(\hat{Z}), \qquad t_1 < t_2 < \dots < t_k$$

is the probability measure of the cylindrical set of Y: $\{y : z(t_i) \in \eta_i,\ i = 1, 2, \dots k,\ \eta_i \in \mathcal{B}(Z)\}$.
Let us consider a preparation procedure $W^{I} \in K(h^{I})$ of a
collection of microsystems and a preparation $u^{II} \in K_m \subset C'(\hat{Y})$ of a collection II
of macrosystems and let us correlate effects of the microsystem with
observations of the trajectories of the macrosystem. Then the preparation of the
composite system can be described by a positive normalized measure defined
on $\hat{Y}$ with values in $TC(h^{I})$:

$$u^{I,II}(\omega) = W^{I}\, u^{II}(\omega), \qquad \omega \in \mathcal{B}(\hat{Y});$$

the probability that the microsystem produces the effect $F^{I}$ and the trajectory of
the macrosystem belongs to a set $\omega \subset \hat{Y}$ is given by

$$\mathrm{Tr}(W^{I} F^{I})\, u^{II}(\omega) = \mathrm{Tr}(u^{I,II}(\omega)\, F^{I})$$

Therefore one is led to describe in general the preparation of the system
microsystem I + macrosystem II by a suitable set $K^{I,II}$ of positive normalized
measures $u^{I,II}(\omega)$ on $\mathcal{B}(\hat{Y})$, with values in $TC(h^{I})$, where normalization means
that $\mathrm{Tr}_{h^{I}}(u^{I,II}(\hat{Y})) = 1$. $\mathrm{Tr}(F^{I} u^{I,II}(\omega))$ is the probability that effect $F^{I}$ occurs and
the trajectories of the macrosystem belong to the set $\omega \in \mathcal{B}(\hat{Y})$. By symmetry
under time translations a semigroup of affine mappings $\mathcal{V}'(t)$ of $K^{I,II}$ into
$K^{I,II}$ must exist, representing a preparation consisting in preparing u and
waiting a time t, i.e. the free evolution during a time t, after the preparation u.

A measurement procedure on a microsystem with statistical operator $W^{\mathrm{I}}$ can be described in the following way. One has a system composed of an affected part, prepared with the preparation procedure $u^{\mathrm{II}}$, and of a microsystem prepared with the preparation procedure $W^{\mathrm{I}}$. After a time $T$, chosen in such a way that the micro- and the macrosystem have interacted, one looks at the trajectories of the macrosystem with no regard to the microsystem. The probability that the trajectories of the macrosystem belong to a set $\omega \in \mathcal{B}(Y)$ is given by:

$$p(\omega) = \mathrm{Tr}\big((\mathcal{V}(T)\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}}))(\omega)\, I^{\mathrm{I}}\big) = \mathrm{Tr}\big((\mathcal{V}(T)\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}}))(\omega)\big)$$

where $(W^{\mathrm{I}} \cdot u^{\mathrm{II}})(\omega) = W^{\mathrm{I}} u^{\mathrm{II}}(\omega)$, $\mathcal{J}$ being a suitable affine mapping of $K(h^{\mathrm{I}}) \times K_m$ into $K^{\mathrm{I,II}}$. Since $p(\omega)$ is an affine functional of $W^{\mathrm{I}}$ on $K(h^{\mathrm{I}})$ and $0 \le p(\omega) \le 1$, there exists a uniquely identifiable effect $F^{\mathrm{I}}(\omega) \in L(h^{\mathrm{I}})$ such that $p(\omega) = \mathrm{Tr}(F^{\mathrm{I}}(\omega) W^{\mathrm{I}})$. The set of effects $F^{\mathrm{I}}(\omega)$, $\omega \in \mathcal{B}(Y)$, is an 'effect'-valued measure on the $\sigma$-algebra of Borel subsets of $Y$; it defines an observable of the microsystem. This notion of an observable is a straightforward generalization of the usual representation of a set of compatible observables by a set of commuting self-adjoint operators $A_1, A_2, \ldots, A_k$; in fact one has the correspondence $\{A_i,\ i = 1, 2, \ldots, k\} \leftrightarrow P(E)$, $E \in \mathcal{B}(R^k)$, $P(E)$ being the common spectral measure of $A_1, A_2, \ldots, A_k$ such that

$$A_i = \int_{R^k} \lambda_i \, dP(\lambda)$$

$P(E)$ is a projection valued measure on the $\sigma$-algebra of Borel subsets of $R^k$. The generalization consists in replacing the projection valued measure with an effect valued one and $R^k$ by $Y$. Let the affected part have a pointer whose position is $x \in R$; then one can write with obvious notations

$$z = (x, z'), \qquad \omega(t_0, E) = \{y : x(t_0) \in E\}, \qquad E \in \mathcal{B}(R)$$

and consider the effects $F_{t_0}(E) = F(\omega(t_0, E))$. For fixed $t_0$ and $E \in \mathcal{B}(R)$ one has an effect valued measure on $R$; if in particular the effects are idempotents, then

$$\int_{-\infty}^{+\infty} \lambda \, dF_{t_0}(\lambda)$$

would be an ordinary observable. Therefore we see that an affected part, prepared by a preparation procedure $u^{\mathrm{II}}$ and left to interact with a microsystem in a fixed time interval $T$, identifies an observable of the microsystem, which could be explicitly calculated if $u^{\mathrm{II}}$ and $\mathcal{V}(T)$ were explicitly known.
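The difference between an effect valued measure and a projection valued one can be made concrete in the smallest possible case. The sketch below assumes a two-level microsystem and a pointer with two readings that is wrong with a hypothetical error rate `eps`; the matrices, names and numbers are illustrative assumptions, not part of the theory described above. The two effects are positive and sum to the unit operator, so $\mathrm{Tr}(F W)$ gives probabilities, but they are idempotent only when `eps` is zero.

```python
# Illustrative assumption: real 2x2 matrices for a two-level microsystem.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def is_idempotent(A, tol=1e-12):
    """True when A A = A, i.e. when the effect is a projection."""
    AA = matmul(A, A)
    return all(abs(AA[i][j] - A[i][j]) < tol
               for i in range(2) for j in range(2))

def unsharp_effects(eps):
    """Two effects of a pointer that misreads with probability eps (hypothetical)."""
    F_up = [[1.0 - eps, 0.0], [0.0, eps]]
    F_down = [[eps, 0.0], [0.0, 1.0 - eps]]
    return F_up, F_down

# A statistical operator: positive, trace one.
W = [[0.75, 0.25],
     [0.25, 0.25]]

F_up, F_down = unsharp_effects(0.1)
p_up = trace(matmul(F_up, W))      # probability of the reading 'up'
p_down = trace(matmul(F_down, W))  # probability of the reading 'down'
print(p_up, p_down, is_idempotent(F_up))
```

Setting `eps = 0.0` turns both effects into projections, which corresponds to the special case of an ordinary observable mentioned in the text.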

Observables corresponding to different $u^{\mathrm{II}}$ are in general not compatible. If one assumes that, apart from superselection rules, all elements of $L(h^{\mathrm{I}})$ are effects, one has a deep, yet unexplored, link between the structure of Hilbert spaces and superselection rules for microsystems, the structure of spaces $Z$ for macrosystems and the interactions between micro- and macrosystems. In this treatment it has been assumed for simplicity that the microsystem is not absorbed by the macrosystem, i.e. the possible transition of system (I, II) to system II is not taken into account. Let us consider the microsystem after the interaction. The fact that after a suitable time $\bar{t}$ the interaction is negligible can be formalized by

$$\mathcal{V}(t)\,\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}}) = \mathcal{V}^{0}(t)\,\varphi\,\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}}), \qquad t > \bar{t}$$

where $\mathcal{V}^{0}(t)$ is the no-interaction time evolution mapping

$$(\mathcal{V}^{0}(t)u)(\omega) = e^{-iH^{\mathrm{I}}t}\, u(T_t^{-1}\omega)\, e^{iH^{\mathrm{I}}t}$$

and $\varphi$ is an affine mapping of $K^{\mathrm{I,II}}$ into $K^{\mathrm{I,II}}$, which describes the 'collision' of the microsystem with the macrosystem.

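The probability of an effect $F^{\mathrm{I}}$ with no regard to the macrosystem, after the preparation $W^{\mathrm{I}} \cdot u^{\mathrm{II}}$ and its evolution in a time $t$, is given by

$$\mathrm{Tr}\big(e^{-iH^{\mathrm{I}}t}\,\varphi\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}})(Y)\, e^{iH^{\mathrm{I}}t} F^{\mathrm{I}}\big) = \mathrm{Tr}(W^{\mathrm{I}}(t) F^{\mathrm{I}}),$$

$$W^{\mathrm{I}}(t) = e^{-iH^{\mathrm{I}}t}\,\varphi\mathcal{J}(W^{\mathrm{I}} \cdot u^{\mathrm{II}})(Y)\, e^{iH^{\mathrm{I}}t}$$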


Consider any covering of $Y$ by a numerable set of disjoint Borel sets $\omega_i$ of $Y$; correspondingly one has the following decomposition

$$t > \bar{t}, \qquad W^{\mathrm{I}}(t) = \sum_i p_i(t)\, W_i^{\mathrm{I}}(t)$$

each component of such a decomposition is correlated to a certain set $\omega_i$ of trajectories of the macrosystem. In such a way the measuring process is
explained and an explicit link between F and the description of the affected 
parts has been obtained. It is also possible to link W with the preparation parts. 
Finally the probability of F with a preparation W, can be formally expressed as 
the probability that the trajectories of the composed system, preparation 
part + affected part, belong to a certain set of the trajectory space of such a 
composed system (Ludwig, 1972b). Notice that, in attitude (2) of Section 1, the 
question about the statistical operator of a microsystem after a measurement 
(by which it is not absorbed) cannot be solved within the axiomatics of quantum 
mechanics: one only knows that a statistical operator exists, which represents a 
preparation including the interaction with the apparatus. Concepts such as 
'measurements of the first type of a complete set of observables' are artificial 
ingredients by which simple exercises for students in quantum mechanics can 
be given. 

Let me remark that I have made statements such as 'the probability that the trajectories of the macrosystem belong to a certain set $\omega \subset Y$', skipping for simplicity the problem of how such an objective fact can be ascertained. Such a point is treated in the theory of Ludwig, who formalizes the concept of 'registration' of trajectories. A concrete registration procedure, e.g. a registration by our senses, which registers certain trajectories and discriminates other ones, is always affected by an uncertainty in the registration of the trajectories which are at the boundary between the accepted and the rejected ones. To a registration procedure of trajectories a continuous function $f(y)$ on $Y$ corresponds such that $0 \le f(y) \le 1$; $f(y)$ is called a 'trajectory effect'. The set $[0, 1]$ of $C(Y)$ represents the set of all registration procedures. The probability of registration $f$ for a macrosystem with preparation $u$ is given by $\int_Y f(y)\,du(y)$, i.e. by the value at $f$ of the functional which represents $u$ in $C'(Y)$. Idealized registration procedures which accept or reject trajectories without uncertainty are represented by characteristic functions $\chi_\omega(y)$ of the Borel subsets $\omega$ of $Y$; in such cases one has

$$\int_Y \chi_\omega(y)\,du(y) = u(\omega)$$

which is the result we have used. Analogous considerations hold for the composite system: macrosystem + microsystem. Trajectory effects have an important formal role: the whole theory can be put in a mathematical form which exhibits the same linear and order structures as quantum mechanics. Such formal resemblance could be relevant for the problem of connecting the axiomatic theory of macrosystems with N-body quantum mechanics.
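The role of a trajectory effect can be sketched numerically. The example below assumes, purely for illustration, a finite trajectory space whose elements are labelled by a final pointer position, with a hypothetical preparation `u` given as weights on those trajectories. A sharp registration is a characteristic function and reproduces $u(\omega)$; a fuzzy registration takes intermediate values near the boundary of the accepted set.

```python
# Assumption for illustration: five trajectories, labelled by a pointer position.
trajectories = [0.0, 0.5, 1.0, 1.5, 2.0]
u = [0.1, 0.2, 0.4, 0.2, 0.1]  # preparation: a normalized measure on Y

def registration_probability(f, u, ys):
    """Discrete version of the integral of f(y) du(y) over Y."""
    return sum(f(y) * w for y, w in zip(ys, u))

def sharp(y):
    """Characteristic function of the set omega = {y : position >= 1}."""
    return 1.0 if y >= 1.0 else 0.0

def fuzzy(y):
    """A trajectory effect with 0 <= f <= 1 that ramps up between 0.5 and 1.5."""
    return min(1.0, max(0.0, y - 0.5))

p_sharp = registration_probability(sharp, u, trajectories)
p_fuzzy = registration_probability(fuzzy, u, trajectories)
print(p_sharp, p_fuzzy)
```

The sharp case returns the measure of the accepted set, while the fuzzy case gives a smaller value because the borderline trajectory at position 1.0 is registered only half of the time.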


The main tool for the description of macrosystems is N-body quantum theory, which in many applications can be replaced by N-body classical statistical mechanics. The practical success of this theory is very great, and its difficulties can be attributed to excessive technical complications. The general pattern by which such success is achieved can be described as follows. Let $h$ be the Hilbert space of the N-body structure, $H$ its Hamiltonian, $L(h)$ the set of effects and $K(h)$ the set of statistical operators. In correspondence to a space $Z$, typically a space of n-tuples of functions $\varphi_j(\xi)$, $\xi \in R^k$, $j = 1, 2, \ldots, n$, one guesses a set of fields $\hat{\varphi}_j(\xi)$, $\xi \in R^k$, $j = 1, 2, \ldots, n$, of self-adjoint operators in $h$ and a set of statistical operators $K$. For $W \in K$,

$$\bar{\varphi}_j(\xi, t) = \mathrm{Tr}\big(e^{iHt}\,\hat{\varphi}_j(\xi)\,e^{-iHt}\,W\big)$$

is interpreted as the average value of $\varphi_j(\xi)$ at time $t$ for the statistical collection described by $W$ at $t = 0$. Sometimes also expressions

$$\mathrm{Tr}\big((e^{iHt}\,\hat{\varphi}_j(\xi)\,e^{-iHt} - \bar{\varphi}_j(\xi, t))^2\, W\big)$$

are calculated and interpreted as dispersions. An example of this procedure is as follows: $Z$ = space of Boltzmann distribution functions for a gas,

$$\hat{f}(x, p) = \int e^{ip\cdot\xi}\, \psi^{+}\!\left(x + \tfrac{\xi}{2}\right) \psi\!\left(x - \tfrac{\xi}{2}\right) d\xi$$

$\psi(x)$ being the field operator in the second quantization formalism. In general
such a procedure meets many purposes in macrophysics but does not yield the 
statistical distribution on the trajectory space that underlies such average 
values and dispersions. Using such a procedure one does not have a sufficient 
input for Ludwig's theory of macrosystems. The simplest way to provide such an input would be to find a measure $F(\omega)$ on $\mathcal{B}(Y)$ having values in the set $L(h)$ of effects of N-body theory, such that

$$e^{iHt} F(\omega)\, e^{-iHt} = F(T_t^{-1}\omega), \qquad t \ge 0$$

the last requirement arising since $e^{iHt} \ldots e^{-iHt}$ is the time translation mapping on $L(h)$ and $\omega$ is a subset of a trajectory space. Then $u_W(\omega) = \mathrm{Tr}(F(\omega) W)$, for all $W \in K(h)$, would be an element of $K_m$. The family $F(\omega)$ is an observable by Ludwig's more general definition. It is the macroobservable of the N-body system, which corresponds to the family of idealized effects $\chi_\omega$ in the theory of a macrosystem. One does not expect that such a strong solution of the problem exists. In fact, due to the macroscopic irreversibility of $u_W \in K_m$, it follows that $u_{W_T}$, where $W_T$ is $W$ transformed by a time inversion, should no longer be sensible. Therefore one must give a 'weak' form to the previous requirement. A possible form could be:

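($\alpha_1$). On $\mathcal{B}(Y)$ a measure $F(\omega)$ with values in $L(h)$ must exist and a set $K \subset K(h)$ can be found such that for all $W \in K$

$$\mathrm{Tr}\big(e^{iHt} F(\omega)\, e^{-iHt}\, W\big) = \mathrm{Tr}\big(F(T_t^{-1}\omega)\, W\big)$$

$$u_{e^{-iHt} W e^{iHt}}(\omega) = u_W(T_t^{-1}\omega), \qquad t \ge 0$$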

The requirement that $F(\omega)$ is a measure on $\mathcal{B}(Y)$ is mathematically very restrictive; physically it means that the apparatuses $A_\omega$ which, by a measurement on the N-body structure, register the sets $\omega$ of trajectories, can all coexist. Since the apparatuses which measure a structure of $\sim 10^{23}$ particles are very hypothetical objects, the physical meaning of the previous statement is questionable. Therefore one is led to give a 'weak' formulation also for the measure character of $F(\omega)$, and in place of ($\alpha_1$) one requires:

($\alpha_2$). On $\mathcal{B}(Y)$ a function $F(\omega)$ with values in $L(h)$ must exist and a set $K \subset K(h)$ can be found such that

$$\forall\, W \in K, \qquad u_W(\omega) = \mathrm{Tr}(F(\omega) W)$$

is a measure on $\mathcal{B}(Y)$,

$$u_{e^{-iHt} W e^{iHt}}(\omega) = u_W(T_t^{-1}\omega), \qquad t \ge 0$$

Since in our attitude N-body theory is only a provisional tool to find the right theory, a less strong requirement is meaningful, such as the following one:

($\alpha_3$). On $\mathcal{B}(Y)$ a family of functions $F_\delta(\omega)$, $\delta > 0$, with values in $L(h)$ exists, and a family $K_\delta \subset K(h)$, such that for all $W_\delta \in K_\delta$

$$\lim_{\delta \to 0} \mathrm{Tr}\big(F_\delta(\omega)\, W_\delta\big)$$

exists for all $\omega \in \mathcal{B}(Y)$ and defines a measure $u_{\{W_\delta\}}(\omega)$ on $\mathcal{B}(Y)$ (no existence of $\lim_{\delta \to 0} F_\delta(\omega)$ or of $\lim_{\delta \to 0} W_\delta$ is required); further

$$u_{\{e^{-iHt} W_\delta e^{iHt}\}}(\omega) = u_{\{W_\delta\}}(T_t^{-1}\omega), \qquad t \ge 0$$

Analogous considerations can be made if one assumes that the macrodynamics is Markoffian; then essentially one has $Z$ in place of $Y$ and this is a simplification, but the second part of ($\alpha$) becomes the following: a semigroup $V(t)$ exists on $K_m$ such that

$$(V(t)\,u_W)(\eta) = u_{e^{-iHt} W e^{iHt}}(\eta), \qquad t \ge 0, \quad \eta \in \mathcal{B}(Z)$$

and this is a complication. If assumption ($\alpha_1$) is further restricted by the requirement that $F(\omega)$ is idempotent, it leads to the well-known problem of the macroobservables as self-adjoint commuting operators on $h$. Such a problem has no solution of appreciable generality, and indeed it seems to be too naive a formulation of the problem of macroscopicity. In any case, since there is no logically compelling reason to assume that the dynamics in state space is Markoffian, the usual master equation approach seems to be not completely appropriate for solving the problem.

The main step to obtain $F(\omega)$, $\omega \in \mathcal{B}(Y)$, is to build $F(\omega)$ for $\omega$ being cylindrical sets $(\eta_1 t_1, \eta_2 t_2, \ldots, \eta_k t_k) \subset Y$, $t_i \ge 0$, $\eta_i \in \mathcal{B}(Z)$, which is already sufficient for most physical applications. It is also the sufficient input for the rather technical problem of proving the existence of the function $F(\omega)$, $\omega \in \mathcal{B}(Y)$, which assumes assigned values on the cylindrical sets. A final assumption on $F(\omega)$ and on $K$, which possibly has some implication for the allowed interactions between a microsystem and a macrosystem, is as follows:

($\beta$). For each microsystem $S$ with Hilbert space $h_S$ the following family of elements of $K(h_S)$

$$u_{W_S W}(\omega) = \mathrm{Tr}_h\big(I_S \otimes F(\omega)\, e^{-iHt}\, W_S \otimes W\, e^{iHt}\big), \qquad t \ge 0$$

for all $W_S \in K(h_S)$ and for all $W \in K$, is still a measure on $\mathcal{B}(Y)$; $H$ is the Hamiltonian of the system: N-body structure + microsystem $S$. More generally, according to ($\alpha_3$), one could substitute the right-hand side of this equation by

($\beta_3$)

$$\lim_{\delta \to 0} \mathrm{Tr}_h\big(I_S \otimes F_\delta(\omega)\, e^{-iHt}\, W_S \otimes W_\delta\, e^{iHt}\big)$$

If $F(\omega)$ and $K$ were known, $u_{W_S W}(\omega)$ would be the required input for Ludwig's formal scheme and the problems of measurement would be solved. In conclusion, the difficulty with measurement in quantum mechanics has been shifted from formula (2) to the problem of building $F(\omega)$ and identifying $K$ such that ($\alpha_3$) and ($\beta_3$) hold. The limit $\delta \to 0$ in ($\alpha_3$), ($\beta_3$) should introduce macroscopicity as a limit situation of N-body theory. One can hope that this is only a technical difficulty.


The treatment in this section is based on a paper I am preparing with Dr. G. C. 
Lupieri. I wish to thank Dr. Lupieri for useful discussions about this subject. 


Barchielli, A. and Lanz, L. (1975) 'Non Hamiltonian description of two particle systems,' I.F.U.M., 181, F. J. Milano.
Bell, J. S. (1966) Rev. Mod. Phys., 38, 447.
Bell, J. S. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Bohm, D. (1951) Quantum Theory, Prentice Hall, New Jersey.
Bohm, D. and Aharonov, Y. (1957) Phys. Rev., 108, 1070.
Caldirola, P. (1974) Dalla Microfisica alla Macrofisica, Mondadori, Milano.
Comi, M., Lanz, L., Lugiato, L. A. and Ramella, G. (1975) J. Math. Phys., 16, 910.
Daneri, A., Loinger, A. and Prosperi, G. M. (1962) Nuclear Phys., 33, 297.
Einstein, A., Podolsky, B. and Rosen, N. (1935) Phys. Rev., 47, 777.
d'Espagnat, B. (1971a) Conceptual Foundations of Quantum Mechanics, Benjamin, New York.
d'Espagnat, B. (1971b) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Everett III, H. (1957) Rev. Mod. Phys., 29, 454.
Freedman, S. J. and Clauser, J. F. (1972) Phys. Rev. Letters, 28, 938.
George, G., Prigogine, I. and Rosenfeld, L. (1972) Dansk. Mat. Fys. Medd., 38.
Hepp, K. (1972) Helv. Phys. Acta, 45, 234.
Jauch, J. M. (1968) Foundations of Quantum Physics, Addison Wesley, Reading, Mass.
Jauch, J. M. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Jauch, J. M. and Piron, C. (1963) Helv. Phys. Acta, 36, 827.
Jauch, J. M. and Piron, C. (1969) Helv. Phys. Acta, 42, 842.
Kasday, L. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Lanz, L., Prosperi, G. M. and Sabbadini, A. (1971) Nuovo Cimento, 2 B, 184.
Ludwig, G. (1953) Z. Phys., 135, 483.
Ludwig, G. (1970) Lecture Notes in Physics, 4, Springer, Berlin.
Ludwig, G. (1973a) Lecture Notes in Physics, 29, Springer, Berlin and 'Makroskopische Systeme und Quantenmechanik', Notes Math. Phys. Marburg (1972).
Ludwig, G. (1973b) Lecture Notes in Physics, 29, Springer, Berlin and 'Mess- und Präparierprozesse', Notes Math. Phys. Marburg (1972).
von Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton.
Piron, C. (1964) Helv. Phys. Acta, 37, 439.
Prosperi, G. M. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Selleri, F. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Wigner, E. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Zeh, H. D. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.

The Correspondence Principle and Measurability of 
Physical Quantities in Quantum Mechanics 


Institute of Space Research U.S.S.R. Academy of Sciences, Moscow 

I. Introduction 

Fifty years have passed since the discovery of the uncertainty principle by Heisenberg. Quantum theory has scored great successes in many branches of physics, especially in atomic and nuclear physics and solid-state theory. All physicists agree upon the formalism of quantum mechanics, but there is no agreement on questions of the interpretation of quantum mechanics and of measurement theory. Most physicists merely ignore these problems, correctly believing that they are of negligible importance in calculations for particular quantum systems. There are many shades of interpretation of quantum mechanics and of the problems of measurement even in orthodox quantum theory. Still other viewpoints are found in the numerous hidden-variable theories. Bibliographies on the problems of quantum-mechanical measurement and interpretation can be found in
surveys by Margenau (1963), Pearle (1967), Ballentine (1970) and Reece 

In this paper I am going to consider only the question of the constraints imposed upon the measurability of physical quantities by the time-energy uncertainty relation. I shall not consider other questions connected with the interpretation of quantum mechanics and measurement theory and shall only make a few remarks about them. I shall confine myself to the statistical interpretation of quantum mechanics.* According to this interpretation a wave function provides a description of the statistical properties of an ensemble of similarly prepared systems. Ballentine (1970) ascribed this interpretation to Einstein (1949), Popper (1959) and Blokhintsev (1968). In my opinion, the same interpretation was upheld by Mandelstam (1950) in his brilliant lectures on the theory of indirect measurements, which were given at Moscow State University in 1939 and were printed only in 1950.

The statistical interpretation differs from the Copenhagen one (Heisenberg, 1955), which asserts that the wave function provides a complete and exhaustive description of an individual system. Ballentine (1970) has shown that the stronger constraint, which is used in the Copenhagen interpretation, is of no importance in applications of quantum mechanics, but in some cases leads to paradoxes. All sensible quantum-mechanical statements, being statistical ones, always concern not a single dynamical system but an ensemble of systems. For instance, the statement that a measurement of the spin component $\sigma_x$ in the state with $\sigma_z = 1/2$ gives the value $-1/2$ with probability 0.5 is pointless if it concerns only one single system. In reality, this statement means that, measuring the component $\sigma_x$ many times in the similarly prepared state, one gets the value $\sigma_x = -1/2$ in half of the cases. But many measurements cannot be performed on one system, because its state is perturbed after the measurement. For this reason it is necessary to have an ensemble of systems and to perform measurements upon many of them. There is another possibility: to prepare the same initial state of the same individual system repeatedly and to measure each time. But this is an ensemble also.

*This terminology is used by Ballentine (1970).
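The spin example can be turned into a small numerical sketch. It assumes that the prepared state is the eigenstate of $\sigma_z$ with eigenvalue $+1/2$ and that $\sigma_x$ is measured; the function names, the ensemble size and the random seed are illustrative choices. The probability 0.5 is a property of the preparation, and it only becomes observable as a relative frequency over many similarly prepared systems.

```python
import math
import random

def born_probability(state, outcome):
    """|(outcome, state)|^2 for normalized two-component vectors."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2

r = 1 / math.sqrt(2)
spin_z_up = (1 + 0j, 0 + 0j)     # prepared state (assumed: sigma_z = +1/2)
spin_x_down = (r + 0j, -r + 0j)  # eigenvector of sigma_x with value -1/2

p_down = born_probability(spin_z_up, spin_x_down)

def measure_ensemble(p, n, seed=0):
    """Relative frequency of the outcome over n similarly prepared systems."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

freq = measure_ensemble(p_down, 10_000)
print(p_down, freq)
```

A single run of `rng.random() < p` yields only one yes or no, which carries no information about the value 0.5; the frequency over the ensemble does.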

The proposition that there are sensible statistical statements about individual systems in quantum mechanics is an illusion. To make a statistical statement sensible, it is necessary to deal with an ensemble of systems. Hence, there is no reason to insist that a wave function describes a state of an element of an ensemble (a single system), and not the ensemble as a whole. However, if the Copenhagen terminology is understood not word for word but as a peculiar physical slang, used for brevity, then there is no objection to it.

Following von Neumann (1932), let us give now the main statements of 
quantum mechanics. 

(A1). Any state of an ensemble consisting of many identical systems is represented by a definite self-adjoint operator $\hat{U}$ (called a statistical operator or state operator), which is defined in a Hilbert space $\mathcal{H}$. The operator $\hat{U}$ has real non-negative eigenvalues and obeys the conditions



(A2). An observable physical quantity $R$ is represented by a self-adjoint operator $\hat{R}$ in the Hilbert space $\mathcal{H}$. A physical quantity $F(R)$ is represented by the operator $F(\hat{R})$. In particular, the canonical variables $q^i, p_i$ ($i = 1, 2, \ldots, n$) of a classical system correspond to operators $\hat{q}^i, \hat{p}_i$ ($i = 1, 2, \ldots, n$) in the space $\mathcal{H}$, which obey the commutation relations

$$[\hat{q}^i, \hat{p}_k] = i\hbar\,\delta^i_k\,\hat{I}, \qquad [\hat{q}^i, \hat{q}^k] = 0, \qquad [\hat{p}_i, \hat{p}_k] = 0$$

where $\hat{I}$ is the unit operator on $\mathcal{H}$ and $\hbar$ is Planck's constant. A function $F(q, p)$ of canonical variables $q, p$ is represented by the self-adjoint operator $F(\hat{q}, \hat{p})$. The order of the operators is chosen in some way which is not fixed.

(A3). The mathematical expectation $\langle R \rangle$ of an observable $R$ in the state $\hat{U}$ is described by the relation

$$\langle R \rangle_U = \mathrm{Sp}\{\hat{U}\hat{R}\}$$


Rylov 111 

(A4). The state $\hat{U}$ of the physical system ensemble evolves in such a way that after time $t$ it turns into the state $\hat{U}_t$,

$$\hat{U} \to \hat{U}_t = e^{-i\hat{H}t/\hbar}\, \hat{U}\, e^{i\hat{H}t/\hbar} \qquad (4)$$

where $\hat{H}$ is a self-adjoint operator on $\mathcal{H}$. The operator $\hat{H}$, called the Hamiltonian, is a function $H(\hat{q}, \hat{p})$ of $\hat{q}$ and $\hat{p}$. For physical systems which have a classical description, the form of this function coincides with that of the Hamiltonian function of coordinates and momenta.

(A5). Measuring a quantity $R$ on the physical system ensemble which is in the state $\hat{U}$ turns the state of the ensemble after measurement into a state $\hat{U}'$, which is defined by

$$\hat{U} \to \hat{U}' = \sum_n (\hat{U}\chi_n, \chi_n)\, \hat{P}_{[\chi_n]} \qquad (5)$$

where $(\psi, \chi)$ denotes the scalar product of vectors $\psi$ and $\chi$ in $\mathcal{H}$. $\chi_1, \chi_2, \ldots, \chi_n, \ldots$ is a complete set of orthonormal eigenvectors of the operator
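$\hat{R}$. $\hat{P}_{[\chi_n]}$ is the projection operator on the vector $\chi_n$. The projection operator on a unit vector $\varphi$ is defined by the relation

$$\hat{P}_{[\varphi]} f = (f, \varphi)\,\varphi \qquad (6)$$

where $f$ is an arbitrary vector in $\mathcal{H}$.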

The superselection rules (Wick and co-workers, 1952) must be added to these propositions. But I shall not do this, because I shall not use these rules, and the above propositions do not pretend to be an axiomatization of quantum mechanics.
In certain cases the state of the physical system ensemble can be determined by specifying a unit vector $\psi$ in $\mathcal{H}$. Such states are called pure. Their statistical operator can be represented in the form

$$\hat{U} = \hat{P}_{[\psi]} \qquad (7)$$

where $\hat{P}_{[\psi]}$ is the projection operator on the unit vector $\psi$. In this case, due to (6), the statement (3) takes the form

$$\langle R \rangle_\psi = \mathrm{Sp}\{\hat{R}\hat{P}_{[\psi]}\} = \sum_n (\chi_n, \hat{R}\hat{P}_{[\psi]}\chi_n) = \sum_n (\psi, \chi_n)(\chi_n, \hat{R}\psi) = (\psi, \hat{R}\psi)$$

The relation (4) can be written as

$$\psi \to \psi_t = e^{-i\hat{H}t/\hbar}\,\psi$$

The set of vectors $\psi = \chi_1, \chi_2, \ldots, \chi_n, \ldots$ represents an orthonormal basis in the Hilbert space $\mathcal{H}$.
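The equality between the trace form of the expectation value and the form $(\psi, R\psi)$ for a pure state can be checked numerically. The following is a minimal sketch in a two-dimensional space; the vector `psi`, the Hermitian matrix `R` and the helper names are hypothetical, and the scalar product is taken here as conjugate-linear in its first argument, which is a convention chosen for the code only.

```python
def inner(f, g):
    """Scalar product, conjugate-linear in f (a convention chosen for this sketch)."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def apply(A, v):
    """Matrix A acting on the vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def projector(phi):
    """Matrix of the projection operator on the unit vector phi."""
    return [[a * b.conjugate() for b in phi] for a in phi]

def trace_of_product(A, B):
    """Sp{A B} without forming the product explicitly."""
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n))

psi = [0.6 + 0j, 0.8j]               # unit vector: 0.36 + 0.64 = 1
R = [[1.0, 2 - 1j], [2 + 1j, -1.0]]  # a Hermitian (self-adjoint) matrix

lhs = trace_of_product(R, projector(psi))  # Sp{R P_psi}
rhs = inner(psi, apply(R, psi))            # (psi, R psi)
print(lhs, rhs)
```

Both expressions give the same real number, as expected for a self-adjoint operator and a normalized vector.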

The vector $\psi$, describing a pure state of the physical system ensemble, is called the state vector or wave function. All statistical statements of quantum mechanics can be derived from the propositions (A1)-(A5). The detailed analysis of these statements can be found in the monograph by von Neumann (1932). The formal measurement theory can be derived from the above statements. This was shown by von Neumann (1932).

The relation (5), describing the disturbance of the ensemble state under the action of measuring a quantity $R$, represents a special process, which differs from the evolution of the ensemble described by relation (4).
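The difference between the two processes can be seen in a two-level sketch. The code below assumes a Hamiltonian that is diagonal in the eigenbasis of the measured quantity $R$, units with $\hbar = 1$, and a hypothetical initial pure state; none of these choices come from the text. The quantity $\mathrm{Sp}\{U^2\}$ equals 1 for a pure state and is smaller for a mixture: it is left unchanged by the evolution (4) and is reduced by the measurement map (5).

```python
import cmath

def purity(U):
    """Sp{U^2}: equals 1 for a pure state, less than 1 for a mixture."""
    n = len(U)
    return sum(U[i][k] * U[k][i] for i in range(n) for k in range(n)).real

def evolve(U, energies, t, hbar=1.0):
    """Relation (4) for a Hamiltonian diagonal in the chosen basis (assumption)."""
    n = len(U)
    return [[cmath.exp(-1j * (energies[i] - energies[j]) * t / hbar) * U[i][j]
             for j in range(n)] for i in range(n)]

def measure(U):
    """Relation (5) when the eigenvectors of R are the basis vectors:
    only the diagonal elements (U chi_n, chi_n) survive."""
    n = len(U)
    return [[U[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Pure state with psi = (1, 1)/sqrt(2), written as a statistical operator.
U = [[0.5, 0.5],
     [0.5, 0.5]]

U_t = evolve(U, energies=[0.0, 1.0], t=2.0)
U_measured = measure(U)
print(purity(U), purity(U_t), purity(U_measured))
```

In this sketch `evolve` only attaches phases to the off-diagonal elements, whereas `measure` removes them, which is why no choice of Hamiltonian in `evolve` can reproduce the effect of `measure`.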

The quantum mechanical process of measurement has two aspects. The informative side of the measurement of a quantity $R$ represents a registration of some definite statistical distribution of values of $R$ in the given ensemble state. The perturbing side of the measurement describes the disturbance of the ensemble state after measurement. The latter property can be used for the preparation of the system ensemble in a definite state. The two sides of measurement are independent to such an extent that Margenau (1963) insisted on distinguishing between the measurement process (registration) and that of state-preparation.

Indeed, the two sides of measurement are independent to such an extent that they can be realized in principle by means of two different devices. One of them only prepares the state but does not record it, and the other only records but does not disturb it. Let us consider such an idealized measurement process. Let an ensemble $E_\psi$ be described by the wave function $\psi$ and consist of $N$ similar single systems ($N \to \infty$). Let the preparation part $M$ of the device measuring the quantity $R$ act on the systems of the ensemble $E_\psi$. One can imagine the preparing part $M$ as a black box with one inlet and many outlet slits $S_1, S_2, \ldots$. Let all outlet slits be closed except for the slit $S_1$. The $N_1$ systems ($1 \ll N_1 \ll N$) of the ensemble $E_\psi$ enter the box $M$ in turn. A proportion of them is absorbed by the box and a proportion passes out through the open slit $S_1$.* Let $n_1$ systems pass through the slit $S_1$ ($1 \ll n_1 \ll N_1$). The box $M$ transforms a part of the ensemble $E_\psi$ into another ensemble $E_1$, consisting of $n_1$ systems. In other words, the box $M$ prepares the ensemble $E_1$ in a certain state, which is not known exactly. In order to visualise the recording part of the measurement, let us direct $l_1$ systems of the ensemble $E_1$ ($1 \ll l_1$, $k_1 = n_1 - l_1 \gg 1$) into a recording macroscopic device $\mathcal{P}$ with a 'pointer'. Let us assume that the influence of the system upon the device $\mathcal{P}$ deflects the pointer. The magnitude of this deflection shows the measured value of the quantity $R$. Let us assume that the box $M$ is arranged in such a way that an analysis of $l_1$ systems of the ensemble $E_1$ by means of the measuring instrument $\mathcal{P}$ gives the result $R_1$ for all these systems. We shall not be interested in what happens to the systems in the instrument $\mathcal{P}$ and shall not consider them. Because the measurement has given a value $R_1$ of the quantity $R$ for all $l_1$ systems ($l_1 \gg 1$), we have the right to conclude that the quantity $R$ has the value $R_1$ for the rest of the ensemble $E_1$ ($k_1$ systems, $k_1 = n_1 - l_1 \gg 1$).† This conclusion can be drawn without analysing these $k_1$ systems by measurement with instrument $\mathcal{P}$.

*For concreteness one can imagine any physical system as a moving electron. The measuring device measures the electron momentum. The black box $M$ represents a region with a magnetic field, which is orthogonal to the direction of the electron motion. Depending on the magnitude of the electron momentum, the electron is deflected through a certain angle and hits the screen with the slits $S_1, S_2, \ldots$. If only slit $S_1$ is open, then such a device prepares an ensemble of electrons with momentum $p_1$.

Thus by subjecting $N_1$ systems of the ensemble $E_\psi$ to dynamical interaction in the box $M$ and analysing part of them in the instrument $\mathcal{P}$, we get the ensemble $E_1$, consisting of $k_1$ systems. The quantity $R$ has the value $R_1$ in all systems of the ensemble $E_1$. If the ensemble $E_\psi$ is a pure one, then the ensemble $E_1$ must be a pure one and be described by a certain wave function $\psi_1$, because the ensemble systems were subjected only to dynamical interaction with the box $M$.

But the measurement process on the ensemble $E_\psi$ is not finished, because only the one possibility that some systems have the value $R = R_1$ has been investigated. To investigate the possibility that among the systems of $E_\psi$ there are ones which have the value $R_2$ of the quantity $R$, it is necessary to open only the slit $S_2$ and to allow $N_2$ systems ($N_2 \gg 1$, $N_2 \ll N$) of the ensemble $E_\psi$ to pass through $M$. The $n_2$ systems ($n_2 \gg 1$) of $E_\psi$ that pass through the slit $S_2$ form an ensemble $E_2$. Analysing $l_2$ systems ($l_2 \gg 1$, $k_2 = n_2 - l_2 \gg 1$) of $E_2$ and finding a value $R_2$ of the quantity $R$ for all of them, one concludes that the rest of the systems of $E_2$ have the value $R = R_2$ and are described by a wave function $\psi_2$. An analogous process has to be carried out for all slits $S_1, S_2, \ldots$. Let us note that the distribution of the systems of the ensemble $E_\psi$ according to values of $R$ is produced independently of reading the pointer of $\mathcal{P}$, i.e. in the absence of any information about the state of the ensemble $E_\psi$. Of course, the box $M$ is supposed to be arranged in such a way that any slit $S_m$ corresponds to a definite value $R = R_m$.

Let us suppose that all $N_m$ ($m = 1, 2, \ldots$) are equal and that the box $M$ is arranged in such a way that, with all slits $S_1, S_2, \ldots$ open, a system of the ensemble $E_\psi$ entering $M$ must go out through one of the slits. Supposing that $l_m = \alpha n_m$, where $0 < \alpha < 1$ and $\alpha$ is the same for all $m = 1, 2, \ldots$, one concludes that the number $k_m$ ($m = 1, 2, \ldots$) of systems in the ensemble $E_m$ is proportional to the probability of measuring the value of $R$ as equal to $R_m$ among the systems of $E_\psi$.
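The counting argument can be imitated by a small simulation. It is a sketch under stated assumptions: three slits with hypothetical probabilities `probs`, the same number of systems sent for each slit, a fixed analysed fraction `a` standing for $\alpha$, and a seeded pseudo-random generator standing in for the box $M$. The sizes $k_m$ of the prepared ensembles then come out proportional to the probabilities of the values $R_m$.

```python
import random

def run_slit(p_m, n_sent, a, rng):
    """Send n_sent systems with only slit m open; return k_m = n_m - l_m."""
    n_m = sum(1 for _ in range(n_sent) if rng.random() < p_m)  # passed the slit
    l_m = int(a * n_m)                                         # analysed in the recorder
    return n_m - l_m                                           # left in the ensemble E_m

def sort_ensemble(probs, n_sent, a=0.1, seed=1):
    """Repeat the selection for every slit with the same number of systems sent."""
    rng = random.Random(seed)
    return [run_slit(p, n_sent, a, rng) for p in probs]

probs = [0.5, 0.3, 0.2]  # hypothetical probabilities of the values R_1, R_2, R_3
k = sort_ensemble(probs, 20_000)
total = sum(k)
fractions = [k_m / total for k_m in k]
print(k, fractions)
```

The proportionality relies on the two assumptions made in the text: equal numbers $N_m$ for all slits and the same fraction analysed for each of them.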

Let us unite all ensembles $E_1, E_2, \ldots$ into one ensemble $E'$. The ensemble $E'$ is a mixture of pure ensembles $E_1, E_2, \ldots$ and cannot be described by means of a wave function. This is because $E'$ is obtained as a result of a set of different dynamical actions, not of one dynamical action. Indeed, some systems are subjected to the action of the box $M$ with only the slit $S_1$ open, other ones with only the slit $S_2$ open, and so on. These are different dynamical actions, which lead to different results.

I shall call this action the statistical action, keeping in mind that the statistical character of the action manifests itself in a different dynamical action of the measuring device upon systems with different values of $R$.

Of course the objection can be made to the above that it is not necessary to open only one slit in the box $M$. One can open all slits or even not use the box $M$. One can merely take a small part of the ensemble $E_\psi$ and investigate the distribution of the values of $R$ by means of the measurement instrument $\mathcal{P}$. The same statistical distribution of the values of $R$ will be found in the rest of $E_\psi$. Thus, one can find the distribution of values in $E_\psi$ without disturbing it. This is a valid objection. But it takes into account only the informational side of measurement, neglecting the state-preparing side. Essentially it is equivalent to the perfectly correct statement that our knowledge about the ensemble state does not have an influence upon the state of the ensemble. However, the measurement is not reduced merely to a change of information about the state of the ensemble. The measurement influences the ensemble state. All physicists agree upon this. There is disagreement only upon the question of how it influences the state of the ensemble. The influence of measurement upon the state of the system being measured is the main difference between the quantum theory of measurement and the classical one.

†The concept of ensemble was introduced into quantum mechanics in order to know the state without perturbing it (von Neumann, 1932, Chap. 4, Section 1).

Let an electron ensemble state be described by a wave function $\psi$. Then $|\psi(q)|^2\, dV$ represents the probability of finding the electron in the volume $dV$. To measure the electron position means that it is necessary not only to measure the distribution $|\psi(q)|^2$ for the ensemble, but also to determine the action of this measurement upon the ensemble description. To measure the electron position and to find it in the volume $dV$ means selecting all the electrons of the ensemble which have been found in $dV$, constituting a new ensemble of them and solving the problem of the description of this new ensemble. Just such a problem arises in quantum measurement theory. To elucidate the nature of this problem I have divided the united measurement process into two parts: an informational part and a state-preparation part. Such a division is an idealization, which is possible only if the state preparation and the recording process happen instantaneously, and if the change of the ensemble state due to the process (4) can be neglected.

Usually the measurement device cannot be divided into parts: the informa- 
tional one and state-preparation one. Besides nobody measures in the way that 
has been described, i.e. first the systems having the value $R = R_1$ are selected,
the rest of them being discarded, then the systems having the value $R = R_2$
are selected, the rest of them being discarded, and so on. Such a selection,
produced blindly, is ineffective. In practice the measurement is performed in
the following way. One finds the value of the quantity $R$ for a single system.
Depending on the value obtained for $R$, the system is attributed to one of the
ensembles $E_1, E_2, \ldots$ For this reason some physicists believe that the appearance
of a mixed state of the ensemble is connected with a change of information
to an observer. Other authors connect its appearance with the fact that the 
measuring device is a macroscopic one. Some physicists reject the reduction 
of the pure ensemble state to the mixed one, stating (quite correctly), 
that dynamical action cannot reduce the pure state into a mixture (Wigner, 
1963). If in addition one considers that the wave function describes the state
of a single system, and understands this literally, then the measurement
action upon the system to be measured assumes in general a mystical character.


Thus, the quantum measurement process is a set of single measurements. 
This term will be used later on just in this sense. To measure the quantity R and 
to obtain a value R' means performing a set of single measurements of R, the 
selection of those systems for which the measurement has yielded the result R' 
and the constitution of a new ensemble of the selected systems. The measure- 
ment is in the first place a statistical action upon the ensemble systems, which 
can be accompanied by a dynamical one. The statistical action of measurement 
is conditioned by the statistical character of its description in quantum 
mechanics.* The system selection is an attribute of a measurement. The means 
and manner of how this selection is produced is of no importance. In any case 
this selection is not a result of a change of observer information about the 
ensemble state, because, as we have seen, this selection can be produced 
blindly without any information about the ensemble state. 

The relation (5) describes the result of the measurement action upon the
ensemble in the state $U$. The measurement is supposed to be performed
instantaneously, so that the state evolution described by (4) can be neglected.
One can doubt that the measurement action is described by a projection
operator $P_{[\chi_n]}$ upon the eigenvectors $\{\chi_n\}$ of the operator of the measured quantity, and
propose another way. I believe that this is not a very essential detail. I have
chosen the measurement action in the form (5) for the reason that its properties 
have been investigated in detail by von Neumann (1932). 

Unfortunately, the measurement problem is not exhausted by this consider- 
ation. Quantum mechanics always attributes a result of measurement to a state 
U of the ensemble to be measured. For such an attribution to be possible, the 
measurement would have to be performed sufficiently quickly, in principle, 
instantaneously. This means the following. Let the measurement of the 
quantity $R$ continue during the time $T$ and a set $\{R_T\}$ of results be obtained.
This set $\{R_T\}$ depends in general on the duration $T$ of a single measurement. If a
limit of the distribution $\{R_T\}$ with $T \to 0$ exists, then by definition the measurement
of the quantity $R$ can be performed instantaneously. For an instantaneous
measurement the result can be attributed to the state U, in which the ensemble 
has been found directly before measurement, even if the ensemble state has 
changed during the measurement process. 

However, it is possible that some quantity $R$ cannot be measured instantaneously,
i.e. no limit of the distribution $\{R_T\}$ of measurement results exists for $T \to 0$.
For instance the energy and momentum of a particle are such properties. In 

*The action of measurement upon the ensemble state also takes place in the theory of Brownian
motion, where the dynamical action of measurement can certainly be neglected. For instance, let an
ensemble $E_W$ of Brownian particles be described by a function $W(q, t)$ satisfying the
Einstein-Fokker equation. To measure the position of the Brownian particle at the time $t_0$ and to find it
in a volume $V$ means selecting from the ensemble $E_W$ only those particles which have been found
in $V$ at the time $t = t_0$, and constituting a new ensemble $E_{W_0}$, described by the function
$W_0(q, t)$, which is non-vanishing only within $V$ at $t = t_0$. Later, at $t > t_0$, the ensemble $E_{W_0}$ will
evolve in a different way from the ensemble $E_W$. In other words, the measurement at the time $t_0$
changes the probability of detecting the Brownian particle at the point $q$ at the time $t > t_0$, although
no dynamical action has been made upon the particle, and only selection (i.e. statistical action) has
been effective.


accord with the uncertainty principle, the smaller the measurement time $T$ is, the
greater is the inaccuracy of the measurement of energy and momentum (Bohr, 
1928; Heisenberg, 1930; Landau and Peierls, 1931; Mandelstam and Tamm, 
1945; Fock and Krylov, 1947; Aharonov and Bohm, 1961; Fock, 1962). For a 
non-vanishing time T the measurement result cannot be attributed to the 
ensemble state directly before the measurement. Indeed, if this were possible,
then it would be unclear why the measurement results have no limit as the
measurement time $T \to 0$. The measurement results can be
attributed to the ensemble state U during the measurement process only if the 
U is unchanged (or is changed very slightly) during the measurement time. 
Thus, if the measurement requires a non-vanishing time, then its result can be 
attributed to the ensemble state only for those states for which the change 
according to the relation (4) is negligible during the measurement time. 

Using the relations (4) and (5), let us carry out a formal consideration of a
measurement process lasting the time $T$ during which the measurement
instrument is switched on. Let the measurement of the quantity $R$ be performed
within the period $[0, T]$ and the ensemble state be described at the time
instant $t = 0$ by a statistical operator $U_0$. Let the operator $U_0$ evolve within the
period of time $[0, t]$ according to (4), turning into $U_t$ at the instant $t$. Let the
measurement process (5) be performed at the moment $t$, $U_t$ turning into $U'_t$.
Let $U'_t$ evolve within the period $[t, T]$ according to (4), turning into $U'_T$ at the
instant $T$. A simple calculation shows that during the time $T$ the statistical
operator $U_0$ turns into $U'_T$:



$$U_0 \to U'_T = \sum_n \left(U_0\, e^{iHt/\hbar}\chi_n,\; e^{iHt/\hbar}\chi_n\right) P_{[e^{-iH(T-t)/\hbar}\chi_n]} \tag{10}$$

where $\chi_1, \chi_2, \ldots$ is a complete orthonormal set of eigenvectors of the operator $R$.
The instant t, at which the measurement process is performed, is supposed to be 
indefinite but within the period [0, T]. 

For the described process to be a real measurement of the quantity $R$ it is
necessary that the state $U'_T$ depend only on the initial state $U_0$ and the operator
$R$. In particular $U'_T$ must not depend on the instant $t$ at which the
measurement (5) has been performed. It follows from (10) that the last
condition is fulfilled if the vectors $\{\chi_n\}$ are eigenfunctions of the Hamiltonian $H$
and, hence,

$$[R, H]_- = RH - HR = 0 \tag{11}$$

This result can be found in von Neumann's book (1932, Chap. 5, Section 1). It 
means, that the action of the measurement device upon the measured system 
during the measurement process is to be of such a kind, that the Hamiltonian H 
would begin to commute with the operator of the measured quantity. 
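
The role of condition (11) can be illustrated numerically. The following is only a minimal sketch for a hypothetical two-level system, assuming that (4) is the usual unitary evolution of the statistical operator and that (5) is the projective action built from the eigenvectors of $R$; the matrices, the initial state and the helper names `evolve`, `measure` and `final_state` are illustrative choices, not taken from the text.

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (assumption made for this illustration)

def evolve(U, H, t):
    """Dynamical evolution of a statistical operator over a time t (assumed form of (4))."""
    energies, vecs = np.linalg.eigh(H)
    propagator = vecs @ np.diag(np.exp(-1j * energies * t / HBAR)) @ vecs.conj().T
    return propagator @ U @ propagator.conj().T

def measure(U, chi):
    """Measurement action (assumed form of (5)): sum_n (U chi_n, chi_n) P_[chi_n].
    The columns of chi are the orthonormal eigenvectors of R."""
    result = np.zeros(U.shape, dtype=complex)
    for n in range(chi.shape[1]):
        v = chi[:, n]
        weight = np.vdot(v, U @ v)                 # diagonal element of U along chi_n
        result += weight * np.outer(v, v.conj())   # weight times the projector on chi_n
    return result

def final_state(U0, H, chi, t, T):
    """Evolve over [0, t], measure at the instant t, then evolve over [t, T]."""
    return evolve(measure(evolve(U0, H, t), chi), H, T - t)

# Hypothetical example: R = sigma_z, so its eigenvectors form the standard basis.
chi = np.eye(2, dtype=complex)
psi0 = np.array([0.6, 0.8j])
U0 = np.outer(psi0, psi0.conj())
H_commuting = np.array([[1.0, 0.0], [0.0, -1.0]])     # [R, H] = 0
H_noncommuting = np.array([[0.0, 1.0], [1.0, 0.0]])   # [R, H] != 0

same = np.allclose(final_state(U0, H_commuting, chi, 0.3, 1.0),
                   final_state(U0, H_commuting, chi, 0.9, 1.0))
differs = not np.allclose(final_state(U0, H_noncommuting, chi, 0.3, 1.0),
                          final_state(U0, H_noncommuting, chi, 0.9, 1.0))
print(same, differs)
```

In this toy case the final statistical operator does not depend on the instant of measurement when $H$ commutes with $R$, and does depend on it otherwise.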

Suppose the relation (11) is fulfilled. The question arises as to which state the
measured values should be attributed. The instant of measurement is indefinite,
and the ensemble state $U$ changes within the period $[0, t]$ according to (4).
For the measured values of $R$ to be attributed to a definite state $U$, it must


be stationary within the period $[0, t]$.* This leads to the condition

$$[U_0, H]_- = 0 \tag{12}$$

which is a great restriction upon $U_0$. In particular the measurement of the
distribution over the momenta of a free particle is possible (in the one- 
dimensional case) only if the momentum value is quite definite. 

The above formal consideration is not consistent because on the one hand it 
uses the instantaneous measurement process, but on the other hand the 
measurement is supposed to continue for a non-vanishing period of time. 
Nevertheless it indicates that a long duration of measurement of some quan- 
tities should give rise to obstacles to their measurability. 

The necessity of a long measurement of physical quantities of the energy- 
momentum pattern is conditioned in the end by the time-energy uncertainty 
relation. Unlike the position-momentum uncertainty relation (Heisenberg, 
1927; Robertson, 1929) it cannot be derived from the statements (1)-(5) of
quantum mechanics. Really, the formalism of quantum mechanics contains the 
time t as a parameter, which commutes with the Hamiltonian H and does not 
conjugate to H in the sense (2), as q and p do. Thus, the time-energy 
uncertainty relation is an additional statement, which should be taken into 
account in the formalism of quantum mechanics. It does not permit the 
instantaneous measurement of the energy-momentum pattern quantities 
and leads to the restrictions (11) and (12). All this is in contradiction 
to the basic statements (1)-(5) of quantum mechanics, which suppose
the instantaneous measurability of physical quantities in any state, and
apparently is connected with the non-relativistic character of quantum
mechanics.

The subsequent analysis shows, that in reality relation (12) cannot be fulfilled 
for quantities of the energy-momentum pattern. In other words, the measured 
values of energy and momentum can never be attributed to any definite 
ensemble state. In this sense energy and momentum are not measurable, and
this is the corollary of the time-energy uncertainty relation. 

Later on I shall use the coordinate representation of the state vector. In this
representation every vector $\psi$ in the Hilbert space $\mathcal{H}$ is represented by a
square-integrable function $\psi$ of the coordinate $q$. The scalar product $(\varphi, \psi)$ of
two vectors $\varphi$ and $\psi$ is represented by

$$(\varphi, \psi) = \int \varphi^*(q)\,\psi(q)\,dq \tag{13}$$

where * denotes a complex conjugate. Integration is carried out over all coordinates $q$.
The position operator $q$ and the momentum operator $p$ are defined respectively
as the operator of multiplication by $q$ and as the differentiation operator

$$p = -i\hbar\,\frac{\partial}{\partial q} \tag{14}$$


*Within the period $[t, T]$ the state $U'_t$ is stationary due to (11).


2. The Possibility of an Experimental Test of the 
Statistical Statements of Quantum Mechanics 

In the present section I shall investigate the possibility of an experimental test 
of the statistical statements, represented by (A2) and (A3). Because this 
problem is very complicated, I shall confine myself to the investigation of a 
measurability of the simplest physical quantities such as coordinate and 
momentum. The complication consists of the impossibility of producing a 
formal analysis. For instance in measuring a momentum on the one hand it is 
stated that the momentum operator is represented by (14) and this operator has 
to be used in (8) for the calculation of the corresponding mean values, on the 
other hand it is necessary to describe some measurement process which is by 
definition the momentum measurement. If the measurement results disagree 
with the statements (8) of quantum theory, then its proponent can always say 
that this measurement process is not a momentum measurement, and for 
momentum measurement another measurement process should be used. 

To reduce the number of possible measurement processes I shall require that 
any measurement satisfies the correspondence principle. This means that a
measurement process and its result must not depend on the model of the 
phenomena used in the measurement (classical, quantum or some other kind). 
In particular, the measurement process applied to a system which permits a 
classical description must give results, which agree with classical mechanics. 

The term 'correspondence principle' was introduced by Bohr (1918) to 
establish a connection between the old (before 1925) quantum mechanics and 
the classical theory of radiation. Originally it meant that the radiation fre- 
quency emitted by transition from one quantum orbit to another approaches 
asymptotically one of the frequencies which are obtained from a Fourier series 
of functions describing the motion of the electron. 

In the contemporary version of the quantum mechanics the correspondence 
principle describes some correspondence between the formalism of quantum 
mechanics and that of classical mechanics. For instance, the operator
$p = -i\hbar\,\partial/\partial q$, which is unlike the classical momentum, is interpreted as momentum
and in many cases (for instance in approximate estimations) is substituted by 
the classical momentum. This is the corollary of the correspondence principle. 

The following consideration is a base for such a correspondence. Let a 
particle ensemble be described by the wave function 

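$$\psi = \psi(q) = \sqrt{\rho(q)}\,\exp\left\{\frac{iS(q)}{\hbar}\right\} \tag{15}$$

where $\rho$ and $S$ are real functions of $q$. For simplicity only the one-dimensional
case is considered. If $\rho$ and $p = \partial S/\partial q$ change slightly within the distance of a
wavelength $\lambda = \hbar/p$, i.e.

$$\left|\frac{1}{\rho}\frac{\partial \rho}{\partial q}\right| \ll \frac{1}{\lambda}, \qquad \left|\frac{1}{p}\frac{\partial p}{\partial q}\right| \ll \frac{1}{\lambda} \tag{16}$$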


then such an ensemble can be described classically with S(q) as an action. The 
density state of such an ensemble in a phase space (q, p) is described by the 
distribution function 

$$W(q, p) = \rho(q)\,\delta\!\left(p - \frac{\partial S}{\partial q}\right) \tag{17}$$


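Indeed, from (8) and (15)-(17) one obtains for the mean value of a function $F(q, p)$

$$\langle F(q, p)\rangle_\psi = \int \psi^*(q)\,F\!\left(q, -i\hbar\frac{\partial}{\partial q}\right)\psi(q)\,dq = \int \rho(q)\,F\!\left(q, \frac{\partial S}{\partial q}\right)dq = \int F(q, p)\,W(q, p)\,dq\,dp \tag{18}$$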

For this reason one concludes, that the operator (14) corresponds to momen- 
tum. The momentum is interpreted in the sense of classical mechanics. The 
function W(q, p) describes an ensemble of classical systems. The state of every 
system is described as a 'point' in the phase space. Every 'point' has a volume
which is greater than the characteristic volume $\hbar$ of the phase space. The
essential dependence of the distribution (17) on the coordinate $q$ alone is
conditioned by the fact that the ensemble state is pure, i.e. the ensemble state is
described by a wave function, not by a mixture of them.
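
The correspondence expressed by (18) can be checked numerically for a simple case. The sketch below is only an illustration under stated assumptions: it takes $F = p^2$, a Gaussian density $\rho$ of width `sigma` and a linear action $S = p_0 q$ (all hypothetical choices), works in units with $\hbar = 1$, and compares the quantum mean value with the classical one computed from $\rho\,(\partial S/\partial q)^2$.

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (assumption made for this illustration)

def mean_p2_quantum(q, rho, S):
    """<p^2> = hbar^2 * integral |d psi/dq|^2 dq for psi = sqrt(rho) exp(iS/hbar)."""
    psi = np.sqrt(rho) * np.exp(1j * S / HBAR)
    dpsi = np.gradient(psi, q)          # finite-difference derivative of psi
    return HBAR**2 * np.sum(np.abs(dpsi) ** 2) * (q[1] - q[0])

def mean_p2_classical(q, rho, S):
    """<p^2> computed classically with p = dS/dq weighted by rho."""
    p = np.gradient(S, q)
    return np.sum(rho * p**2) * (q[1] - q[0])

def relative_gap(sigma, p0, half_width=40.0, n=160001):
    """Relative difference between the quantum and the classical <p^2>."""
    q = np.linspace(-half_width, half_width, n)
    rho = np.exp(-q**2 / (2.0 * sigma**2)) / np.sqrt(2.0 * np.pi * sigma**2)
    S = p0 * q
    quantum = mean_p2_quantum(q, rho, S)
    classical = mean_p2_classical(q, rho, S)
    return (quantum - classical) / classical

# Wide packet: rho varies slowly on the scale of the wavelength hbar / p0.
gap_wide = relative_gap(sigma=5.0, p0=20.0)
# Narrow packet: rho varies on the scale of the wavelength itself.
gap_narrow = relative_gap(sigma=0.05, p0=20.0)
print(gap_wide, gap_narrow)
```

For the wide packet the two mean values practically coincide, while for the narrow packet, where the slow-variation condition is violated, they differ appreciably.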

To avoid a measurement process description for every single quantity it is 
natural to use the rich experience of classical mechanics. The correspondence 
principle is used for this purpose. For instance, it follows from the correspon- 
dence principle that the quantity p described by the operator (14) is to be 
measured in the same way as a momentum is measured in classical mechanics. 
The measurement connects the quantum mechanics formalism symbols with 
phenomena of the real world and puts a content and a sense into these symbols. 
Referring to classical mechanics, the correspondence principle formalizes a 
relation between the formalism of quantum mechanics and measurement. The 
measurement procedures for different quantities are supposed to be worked 
out in classical mechanics. Because the correspondence principle is the least 
formalized part of the theory, I shall make it responsible for a possible 
disagreement between experiment and the quantum theory formalism. This 
means that the experimental test of the statistical statements of the quantum 
theory is considered as a test of the correspondence principle. 

Although relation (8) describes only expressions for mean values, if it is valid for
all self-adjoint operators then, as von Neumann (1932, Chap. 4) has shown, it
contains all the statistical statements of quantum mechanics and permits the 
calculation of the probability of measuring a given value of any quantity R. In 
particular it follows from (8), that in every single measurement one can obtain 
only that value R', which is an eigenvalue of the operator R. 

Suppose, for instance, that a self-adjoint operator $R$ has a discrete spectrum
of eigenvalues $R_1, R_2, \ldots$ Let for simplicity every eigenvalue $R_i$ ($i = 1, 2, \ldots$)
be related to only one eigenvector $\chi_i$. The vectors corresponding to unlike


eigenvalues are orthogonal. Let us normalize them in such a way, that 

$$(\chi_i, \chi_k) = \int \chi_i^*(q)\,\chi_k(q)\,dq = \delta_{ik}, \qquad i, k = 1, 2, \ldots \tag{19}$$

An arbitrary wave function $\psi$ can be represented in the form

$$\psi(q) = \sum_i a_i\,\chi_i(q) \tag{20}$$

It follows from (19), (20) and the normalization condition of the wave function
$\psi$ that

$$\sum_i a_i a_i^* = \sum_i |a_i|^2 = 1 \tag{21}$$

Let us calculate the mean value $\langle F(R)\rangle_\psi$ of a function $F$ of the quantity $R$.
Because the $\{\chi_i\}$ are eigenvectors of the operator $R$,

$$R\chi_i = R_i\chi_i, \qquad i = 1, 2, \ldots$$

one obtains from (8) and (20):

$$\langle F(R)\rangle_\psi = \sum_i F(R_i)\,|a_i|^2 \tag{22}$$

Since (22) is valid for every function $F$, it follows from (21) and (22)
that the quantity $R$ can take only the values $R_i$ ($i = 1, 2, \ldots$). The $|a_i|^2$ is the
probability that the quantity $R$ has the value $R_i$ in the state (20). Thus any single
measurement of the quantity $R$ has to yield one of the eigenvalues $R_i$ of the
operator $R$. It is appropriate to point out that the last statement follows from
(A3) only in the case when the condition (A3) is fulfilled for all operators $R$.

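
The content of (20)-(22) can be illustrated in a finite-dimensional setting. The following is a minimal sketch, assuming a hypothetical $2 \times 2$ self-adjoint matrix in place of the operator $R$ and a hypothetical normalized vector in place of $\psi$; it compares the mean value of $F(R) = R^2$ computed directly with the sum over eigenvalues weighted by $|a_i|^2$.

```python
import numpy as np

def eigen_probabilities(R, psi):
    """Return the eigenvalues R_i of R and the weights |a_i|^2 of psi."""
    eigenvalues, chi = np.linalg.eigh(R)   # columns chi[:, i] are orthonormal eigenvectors
    a = chi.conj().T @ psi                 # expansion coefficients a_i = (chi_i, psi)
    return eigenvalues, np.abs(a) ** 2

# Hypothetical self-adjoint operator and normalized state (illustrative values only).
R = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
psi = np.array([1.0, 1.0j]) / np.sqrt(2.0)

eigenvalues, probs = eigen_probabilities(R, psi)

# Mean value of F(R) = R^2 computed directly and through the spectral sum.
mean_direct = np.vdot(psi, R @ R @ psi).real
mean_spectral = np.sum(eigenvalues**2 * probs)
print(probs.sum(), mean_direct, mean_spectral)
```

The weights sum to unity, as in (21), and the two mean values agree, as in (22).
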
Let us consider the problem of particle-position measurement. For simpli- 
city, the particle is considered to be charged (for example an electron). Let 
there be some macroscopic device (generator), which prepares an electron in 
some state. For instance, an electron gun can serve as a generator. Let us 
imagine many similar generators (an ensemble) which are in the same macro- 
scopic state. Every generator prepares an electron upon which a single 
measurement is performed. Single measurements performed upon different 
electrons yield, in general, different results. The set of single measurement
results permits the determination of the distribution of the quantity in the state $\psi$ in
which we are interested. Let a detector, capable of recording the time at which
an electron passes through it, be placed some distance from the generator. The
detector can be represented by a Geiger counter or other similar device. Let $\tau$
be the operating time of the detector. Let a set of experiments be performed.
During each experiment the generator is switched on, and, if the detector trips,
then it records the time of its tripping. If the detector trips at a time $t$ after
switching the generator on, then this means, by definition, that during the
period $(t - \tau, t)$ the electron was found in the volume of the detector. Thus the
electron coordinates in this period coincide with the detector coordinates
within the precision of the detector size.


Let a set of $N$ experiments be performed at a fixed position of the detector.
Let $N_0$ be the number of times the detector has not tripped, $N_1$ the number of
times it has tripped during the period $\tau$ [i.e. in the period $(0, \tau)$], $N_2$ the number
of times it has tripped in the period $2\tau$ and so on. We have

$$N = N_0 + N_1 + N_2 + \ldots$$

The limit

$$\lim_{N \to \infty} N_s/N$$

represents the probability of detecting an electron in the time $s\tau$ within the
space taken up by the detector. For this reason one has

$$|\psi(q, s\tau)|^2\,dV = \lim_{N \to \infty} \frac{N_s}{N}$$

where $\psi$ is the electron wave function, $q$ are the detector coordinates and $dV$ is
its volume. By performing measurements with different dispositions of the
detector with respect to the generator, $|\psi(q, s\tau)|^2$ can be calculated for
different positions $q$ and times $s\tau$ ($s = 1, 2, \ldots$) within the detector size and
operating time $\tau$. The detector size and its operating time are reduced as far as
possible in order to increase the accuracy of the calculation of $|\psi(q, s\tau)|^2$. An
optical or electron microscope can be used if necessary. I shall not describe the 
measurement of the particle position by means of the microscope but refer to 
the paper by Mandelstam (1950). It should only be noted that by using a 
photon (electron) beam of sufficiently high energy, the position and the 
registration moment of an electron can be determined in principle with any 
desirable accuracy. This means that the electron coordinates can be measured 
with arbitrary accuracy and instantaneously. 
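
The counting procedure can be imitated by a simple simulation. The sketch below is purely illustrative: the tripping probabilities in `true_p` are invented numbers (index 0 standing for 'the detector has not tripped', index $s$ for 'tripped in the $s$-th operating period'), and the pseudo-random generator stands in for the ensemble of identical generators and single measurements.

```python
import numpy as np

def simulate_counts(slot_probabilities, n_experiments, seed=0):
    """Return the counts N_0, N_1, N_2, ... for n_experiments single measurements."""
    rng = np.random.default_rng(seed)
    outcomes = rng.choice(len(slot_probabilities), size=n_experiments,
                          p=slot_probabilities)
    return np.bincount(outcomes, minlength=len(slot_probabilities))

# Hypothetical probabilities: no trip, trip in period 1, period 2, period 3.
true_p = np.array([0.40, 0.30, 0.20, 0.10])

counts = simulate_counts(true_p, 200_000)
frequencies = counts / counts.sum()   # the ratios N_s / N
print(counts.sum(), frequencies)
```

With a growing number of experiments the ratios $N_s/N$ approach the underlying probabilities, which is the sense in which the limit above determines $|\psi(q, s\tau)|^2\,dV$.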

Of course, a statement of such a kind is an idealization of the real state of
affairs. Nevertheless I shall adopt this thesis, remembering that within non-relativistic
physics nothing hinders in principle the accurate and instantaneous
measurement of the electron position, if the energy is not too high and pair
production can be neglected (see, however, Landau and Peierls, 1931; Pauli,
1933, Section 2).

Let us consider the problem of the measurement of electron momentum. In
classical mechanics it is supposed that the influence of measurement upon
electron motion can be made infinitesimal. So to measure the electron momentum
it is sufficient to measure two neighbouring positions $q$ and $q + \Delta q$ of a
single electron which are separated by a short period of time $\Delta t$ and to calculate
the electron velocity $v = \Delta q/\Delta t$. Thereafter the momentum $p$ is defined by

$$p = mv = m\,\frac{\Delta q}{\Delta t} \tag{23}$$

Formula (23) determines the mean momentum over a period of time $\Delta t$.
Measuring momentum over ever shorter periods of time $\Delta t$ one obtains in the limit


the exact value of the momentum. In this case the limit of $\Delta q/\Delta t$ with $\Delta t \to 0$ is
supposed to exist.

In quantum mechanics such a method of measurement is also possible. As 
the position can be measured, in principle, instantaneously, it is possible to 
measure two positions of an electron at two instants separated by a short period 
of time and then to use formula (23). By definition, the measurement result is a 
momentum averaged over a time period $\Delta t$. Let us suppose that a set of such
measurements is performed upon an electron ensemble described by a wave
function $\psi$. A number of momentum values, described by a spread $\Delta p$, is
obtained. Generally, the spread (uncertainty) of momentum depends on the
wave function before the first position measurement, but in any case due to the
uncertainty principle

$$|\Delta p| \ge \frac{\hbar}{2|\Delta q|} \tag{24}$$

where $|\Delta q|$ is the distance between the positions of the electron in the first and
second measurements of its position. As $|\Delta q| < c\,\Delta t$, where $c$ is the speed of light,
it follows from (24) that

$$|\Delta p| > \frac{\hbar}{2c\,\Delta t}$$

Thus, unlike classical mechanics, a reduction in the time of measuring momentum
leads to an increase of momentum uncertainty independent of the form of
the wave function, which has described the electron ensemble before measure- 
ment. Although nothing prevents a set of measurements of momentum being 
averaged over a short period of time At, one cannot assert that the resulting 
momentum values represent those of an electron ensemble described by some 
wave function. These momentum values cannot be attributed to the wave 
function, which described the ensemble directly before measurement, because 
the measurement result depends on the measurement duration $\Delta t$ (with
$\Delta t \to 0$). The momentum values cannot be attributed to the wave function
arising during measurement, because, however short the period is, the wave 
function changes in this period essentially, and that moment, to which the 
measured values should be attributed, is unknown. 

Thus, although a measurement can be made, the results cannot be connected
with any wave function and, hence, with the statistical statements of quantum
mechanics.
Let us consider the measurement of momentum averaged over a long period
of time $T$. Let there be a generator localized in some region $\Omega$ with linear
dimension of order $\Delta q$. The generator prepares an electron ensemble in a state
described by a wave function $\psi$. Let the electron be detected at a distance $q$
from the generator at time $T$ after switching the generator on ($|q| \gg \Delta q$). Let us
assume, by definition, that the average of the electron momentum over a period


$T$ is measured. It is defined by

$$p = \frac{mq}{T} \tag{25}$$

Essentially the relation (25) is the relation (23) used for the case when
$T$ is much greater than the characteristic time of evolution of the wave
function. The inaccuracy $\Delta p$ of momentum measurement is determined by the relation

$$\Delta p = \frac{m\,\Delta q}{T} \tag{26}$$

For a fixed generator size the inaccuracy is less the longer the measurement
time. I shall call momentum defined by relation (25) q-momentum (from the
word 'quantum') in contrast to the momentum defined by the relation (23) with
$\Delta t \to 0$. I call the latter c-momentum (from the word 'classical').

In the absence of an electromagnetic field the q-momentum
distribution can be determined by performing a set of single measurements of
q-momentum. This distribution is determined by the relation

$$W(p)\,dp = |\psi_p|^2\,dp \tag{27}$$

where $W(p)\,dp$ is the probability of measuring a momentum $p$ within the region
$dp = dp_1\,dp_2\,dp_3$ and

$$\psi_p = \frac{1}{(2\pi\hbar)^{3/2}} \int e^{-ipq/\hbar}\,\psi(q, t)\,dq \tag{28}$$

is the Fourier component of the wave function.

It should be noted that the measurement of q-momentum (25) is reduced to a
position measurement at the moment $t = T$. The wave function of the free
electron evolves in such a way that for $t$ long enough the form of $|\psi(q, t)|^2$
determines $|\psi_p|^2$.

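I shall show this in the simple example of one-dimensional motion. Let the
wave function at the initial moment have the form

$$\psi(q, 0) = \frac{1}{\sqrt{2\pi\hbar}} \int \psi_p\,e^{ipq/\hbar}\,dp \tag{29}$$

$$\psi_p = \left(\frac{A}{\sqrt{\pi}\,\hbar}\right)^{1/2} \exp\left[-\frac{A^2(p - p_0)^2}{2\hbar^2}\right] \tag{30}$$

where $A$ is a constant representing an effective width of the wave packet and $p_0$ is
the mean value of the q-momentum. According to (27) and (30) the probability
of detecting q-momentum $p$ within the range $dp$, which is

$$W(p)\,dp = |\psi_p|^2\,dp = \frac{A}{\sqrt{\pi}\,\hbar} \exp\left\{-\frac{A^2(p - p_0)^2}{\hbar^2}\right\} dp \tag{31}$$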




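has a normal distribution form with mean value $p_0$ and dispersion $\hbar^2/(2A^2)$. At
the initial moment the packet centre was found at the point $q = 0$. With the
passage of time the wave function evolves according to Schrödinger's equation.
At the moment $t$ it turns into

$$\psi(q, t) = \frac{1}{\sqrt{2\pi\hbar}} \int \psi_p \exp\left[\frac{ipq}{\hbar} - \frac{ip^2 t}{2m\hbar}\right] dp \tag{32}$$

Substitution of (30) into (32) and calculation yields the result

$$\psi(q, t) = \left[\frac{A}{\sqrt{\pi}\,(A^2 + i\hbar t/m)}\right]^{1/2} \exp\left\{-\frac{(q - p_0 t/m)^2}{2A^2(1 + i\hbar t/A^2 m)} - \frac{ip_0^2 t}{2m\hbar} + \frac{ip_0 q}{\hbar}\right\} \tag{33}$$

Hence, one gets for the probability $dW$ of detecting the particle in the vicinity
of the point $q$ within the range $dq$ at the moment $t$

$$dW = |\psi(q, t)|^2\,dq = \frac{A}{\sqrt{\pi (A^4 + \hbar^2 t^2/m^2)}} \exp\left[-\frac{A^2 (q - p_0 t/m)^2}{A^4 + \hbar^2 t^2/m^2}\right] dq \tag{34}$$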

For $t \gg mA^2\hbar^{-1}$ the coordinate distribution reproduces the initial distribution
(31) over q-momenta. Indeed, in accordance with (25), assuming

$$p_t = \frac{mq}{t} \tag{35}$$

and substituting $q$ from (35) into (34), one gets

$$W(p_t)\,dp_t = \frac{A}{\sqrt{\pi B}\,\hbar} \exp\left[-\frac{A^2(p_t - p_0)^2}{\hbar^2 B}\right] dp_t \tag{36}$$

$$B = 1 + \frac{m^2 A^4}{\hbar^2 t^2} \tag{37}$$

If $t \to \infty$, then $B \to 1$ and the q-momentum distribution, obtained from the
coordinate distribution at the moment $t$, coincides with the distribution (31).
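
The passage from the coordinate distribution (34) to the q-momentum distribution (31) can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = m = 1$ and hypothetical values of $A$ and $p_0$; the function names are illustrative. It re-expresses (34) in the variable $p_t = mq/t$ and compares the result with (31) at an early and at a late moment.

```python
import numpy as np

def momentum_density(p, A, p0, hbar=1.0):
    """Initial q-momentum distribution, as in (31)."""
    return A / (np.sqrt(np.pi) * hbar) * np.exp(-(A * (p - p0) / hbar) ** 2)

def position_density(q, t, A, p0, m=1.0, hbar=1.0):
    """Coordinate distribution of the spreading packet, as in (34)."""
    spread = A**4 + (hbar * t / m) ** 2
    return A / np.sqrt(np.pi * spread) * np.exp(-(A**2) * (q - p0 * t / m) ** 2 / spread)

def q_momentum_density(p_t, t, A, p0, m=1.0, hbar=1.0):
    """Distribution (34) rewritten in the variable p_t = m q / t (dq = t dp_t / m)."""
    return position_density(p_t * t / m, t, A, p0, m, hbar) * (t / m)

# Hypothetical packet parameters (illustrative values only).
A, p0 = 1.0, 2.0
p = np.linspace(-8.0, 12.0, 4001)

# Largest deviation from (31) at an early moment and at a late moment.
early = np.max(np.abs(q_momentum_density(p, 1.0, A, p0) - momentum_density(p, A, p0)))
late = np.max(np.abs(q_momentum_density(p, 1.0e4, A, p0) - momentum_density(p, A, p0)))
print(early, late)
```

At the early moment the rescaled coordinate distribution is visibly broader than (31), whereas for $t \gg mA^2/\hbar$ the two practically coincide.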

It follows from the example, that the q-momentum distribution subsequently 
turns into a coordinate distribution and can be measured. 

It should be noted that, measuring $|\psi_p|^2$ in such a way, we have no right to
assert that the momentum distribution is measured in any definite state. The
fact is that the wave function has changed essentially during the measurement time.
At first it was localized in the vicinity of the generator and then it spread
over space. By measurement one obtains only time-independent characteristics
of the wave function such as the amplitudes $|\psi_p|$.

Thus the above manner of q-momentum measurement does not permit the 
measurement of momentum or momentum distribution in the state described 


by a wave function. The best that can be measured is the momentum distribu- 
tion averaged in some way over states with different wave functions. In the end 
it is connected with the fact, that due to the uncertainty principle the precise 
measurement of momentum needs a long time in which the wave function 
changes essentially. 

Let us consider momentum measurement based on the law of the conserva- 
tion of momentum. For the measurement of the electron momentum a particle 
of mass M is placed in the electron path. By collision with the particle the 
electron is captured by the particle and passes its momentum to it. In measuring 
the particle momentum, by definition, one measures the electron momentum at 
the instant of impact. For the electron momentum to be measured with an
accuracy $\Delta p$, it is necessary that the initial momentum of the particle is of the
order $\Delta p$. According to the uncertainty principle this is possible only if before
collision the particle is placed within a region with linear size of the order $\Delta q$ and

$$\Delta q \gtrsim \frac{\hbar}{\Delta p} \tag{38}$$


For simplicity I shall consider only one dimension. Let the wave function
describing an electron ensemble have the form of the wave packet (32) with
spread $L$. The time uncertainty of an electron hitting the particle is
determined by the relation

$$\Delta t \approx \frac{mL}{p} \tag{39}$$


where m is the electron mass and p is its momentum. If one measures the 
momentum long enough it can be determined with great accuracy. Thus, the 
uncertainty of electron momentum measurement is conditioned only by the 
uncertainty of the initial momentum of the particle. 

For the measured electron momentum to be attributed to any definite wave
function, it is necessary that the wave function changes slightly during the
measurement time $\Delta t$. In the optimum case the spread $\delta p$ of $\psi_p$ in the region of
the variable $p$ is determined by the uncertainty relation

$$\delta p \approx \frac{\hbar}{L} \tag{40}$$

During the time $\Delta t$ the phases

$$\varphi_p = \left(pq - \frac{p^2 t}{2m}\right)\Big/\hbar \tag{41}$$

corresponding to different non-vanishing Fourier components $\psi_p$ change. The
greatest phase difference arising during $\Delta t$ is

$$\Delta\varphi \approx \frac{p\,\delta p\,\Delta t}{m\hbar} \tag{42}$$




Substituting (39) and (40) into (42) one gets 

$$\Delta\varphi \gtrsim 1 \tag{43}$$

This means that the wave function always changes essentially during the 
measurement time, and the measured values of momentum cannot be attri- 
buted to any definite wave function. They can be attributed only to an ensemble 
state averaged in some sense over the period $\Delta t$. In this sense the q-momentum
distribution cannot be measured in this way. The reason preventing this can be
formulated as the time-energy uncertainty relation. Indeed, the relations (42)
and (43) can be formulated as

$$\Delta E\,\Delta t \gtrsim \hbar \tag{44}$$

where $\Delta E$ is the uncertainty of the electron energy.

Let us consider momentum measurement based on the Doppler effect. Let 
an atom in an excited state radiate a photon with frequency $\omega_0$. If the atom
moves, then the photon, radiated in the direction of motion, has a frequency

$$\omega = \omega_0\left(1 + \frac{V}{c}\right) \tag{45}$$


where $V$ is the atom velocity. Measuring the photon frequency, one can
determine the atom velocity from (45) and hence the atom momentum. The
least error $\Delta\omega$ of the frequency determination is given by the relation

$$\Delta\omega \approx \frac{1}{T} \tag{46}$$

where $T$ is the measurement time, i.e. the period during which the objective of
the spectrometer being used for the photon frequency measurement is open.
Let the atom ensemble be described by a wave function having the form of a
wave packet with spread $L$. Again for simplicity the one-dimensional case is
considered. Supposing that the atom speed $V \ll c$, one gets for the uncertainty
$\Delta t$ of the photon radiation time

$$\Delta t = T + \frac{L}{c} \tag{47}$$


Calculating the atom velocity by means of (45), one obtains

$$\Delta p = M\,\Delta V = \frac{Mc\,\Delta\omega}{\omega_0} \approx \frac{Mc}{\omega_0 T} \tag{48}$$

for the error of the atom momentum measurement. Let us write the wave function of the atom in the form

$$\psi(q, t) = \frac{1}{\sqrt{2\pi\hbar}}\int \psi_p\, e^{ipq/\hbar - iE_p t/\hbar}\,dp \tag{49}$$





where $E_p$ is the energy of the atom with momentum $p$. At the moment of photon radiation the atom energy is reduced by $\hbar\omega$. Photon radiation can occur at any moment of the period $\Delta t$. This entails a phase uncertainty $\Delta\varphi_p$ of one of the Fourier components $\psi_p$, which is determined by

$$\Delta\varphi_p = \frac{\hbar\omega\,\Delta t}{\hbar} \gtrsim \frac{\omega}{\Delta\omega} \gg 1 \tag{50}$$


Thus the phase difference between different $\psi_p$ changes substantially in the period $\Delta t$. This means that the wave function changes substantially during the measurement time $\Delta t$.
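The orders of magnitude in the Doppler argument can be checked numerically. The sketch below is illustrative only: it assumes the least frequency error is of order $1/T$, takes the emission-time uncertainty as the shutter time plus the light travel time across the packet, and uses hypothetical values for the atom mass, line frequency, shutter time and packet size (Gaussian units).

```python
C = 2.998e10  # speed of light, cm/s (Gaussian units)

def doppler_velocity(omega, omega0):
    """Atom velocity from the shifted frequency: omega = omega0 * (1 + V/c)."""
    return C * (omega / omega0 - 1.0)

def momentum_error(mass, omega0, T):
    """Momentum error M*c*d_omega/omega0, assuming d_omega ~ 1/T."""
    return mass * C / (omega0 * T)

def emission_time_uncertainty(T, L):
    """Shutter time T plus the light travel time across a packet of size L."""
    return T + L / C

def phase_uncertainty(omega, dt):
    """Phase uncertainty (hbar*omega*dt)/hbar = omega*dt of a Fourier component."""
    return omega * dt

# Hypothetical values chosen only to show the orders of magnitude.
M = 1.67e-24      # g, roughly a hydrogen atom
omega0 = 3.0e15   # rad/s, an optical line
T = 1.0e-8        # s, shutter time
L = 1.0e-4        # cm, wave-packet spread

dt = emission_time_uncertainty(T, L)
dp = momentum_error(M, omega0, T)
dphi = phase_uncertainty(omega0, dt)
print(f"dt = {dt:.3e} s, dp = {dp:.3e} g*cm/s, dphi = {dphi:.3e}")
```

With any optical frequency and a realistic shutter time the phase uncertainty comes out many orders of magnitude larger than unity, which is the point of the argument above.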

Thus, the measured momentum cannot be attributed to any definite wave function, i.e. the momentum cannot be measured in the sense in which measurement is customarily treated in quantum mechanics. The best that can be done is to state that the measured momentum distribution is attributed to a set of wave functions having constant moduli $|\psi_p|$ of the Fourier components and indefinite phases. Statements of such a kind are absent in conventional quantum mechanics. Apparently they do not contradict quantum mechanics, but one cannot say that they confirm it.

The distribution $|\psi_p|^2$ over q-momenta can be obtained, because it does not involve any wave function describing a free particle, but this is not the measurement of momentum distribution. The restriction on momentum measurement arises not from the impossibility of measuring momentum but from the impossibility of attributing measured values to any definite state.

All the ways considered of measuring momentum fit the case when the 
particle motion obeys the laws of classical mechanics. In other words the 
method of measurement does not depend on the model which is used for the 
explanation of physical phenomena. In classical mechanics also it is necessary 
to attribute the measured value to a definite state but in this case there are no 
wave functions and no uncertainty principle. It is values of coordinates and 
momenta that determine a state. For this reason in classical mechanics the 
problem of attributing measured values to a definite state does not exist. 

Let us consider the measurement of the component $p_1$ of the momentum of a free particle at the point $q$. Let the free-particle ensemble be described by means of a wave function $\psi$. Let us measure the particle position at the moments $t$ and $t + T$. In principle this is possible. Let us select only those single measurements which have given the particle position in a small vicinity of position $q$ at the moment $t$. Let these single measurements constitute the ensemble $E_q$. Suppose further that in these cases the particle position measurement at moment $t + T$ gives a result $q + \Delta q$, where $\Delta q$ is generally different for different single measurements. Let us suppose that each single measurement determines the component $p_1$ of the particle momentum at the point $q$. It is determined by the relation

$$p_1 = \frac{m\,\Delta q^1}{T} \tag{51}$$




Performing many single measurements, one gets a distribution of the momentum component $p_1$ at the point $q$. This distribution depends essentially on the choice of the period $T$ between two consecutive positions of the particle. The shorter the period $T$, the more precisely the position of the particle must be determined and the more energetic the beam of sounding particles (electrons or photons) that must be used for the particle position determination. As a result the particle motion will be disturbed and the distribution over the component $p_1$ will be distorted. The shorter the period, the greater is the momentum dispersion of the particle. This means that the distribution over the component $p_1$ of the particle momentum cannot be measured. The criterion of such an impossibility is the dependence of the distribution on the period $T$ as $T \to 0$.
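The growth of the disturbance as $T \to 0$ can be illustrated by a rough order-of-magnitude estimate. The sketch below is not taken from the text. It assumes that the momentum is estimated as mass times displacement divided by $T$, that a displacement over the interval $T$ is resolved only if the position resolution is no coarser than the distance travelled in $T$, and that a probe with resolution $\delta q$ transfers a momentum of order $\hbar/\delta q$. All function names and numerical values are hypothetical.

```python
HBAR = 1.0546e-27  # reduced Planck constant, erg*s (CGS units)

def momentum_estimate(mass, dq, T):
    """Momentum component inferred from a displacement dq over the time T."""
    return mass * dq / T

def probe_kick(mass, p, T):
    """Order-of-magnitude momentum transferred by the probing beam.

    Assumption: the required position resolution is the distance (p/mass)*T
    travelled during T, and a probe with that resolution transfers ~HBAR/resolution.
    """
    dq_res = (p / mass) * T
    return HBAR / dq_res

m_e = 9.109e-28  # electron mass, g
p = 1.0e-19      # illustrative electron momentum, g*cm/s

for T in (1e-9, 1e-12, 1e-15):
    ratio = probe_kick(m_e, p, T) / p
    print(f"T = {T:.0e} s: kick / p = {ratio:.3e}")
```

Under these assumptions the relative disturbance scales as $1/T$, so the inferred distribution cannot settle to a $T$-independent form as the interval is shortened.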

Let a particle position measurement be realized by means of a beam of sounding particles (for instance, electrons or photons). Let us take the beam to be directed normally to the first axis, along which the momentum component is to be measured. One can expect that in this case the sounding particle beam does not influence the value of the momentum component $p_1$ on the average. This means that the mean value $\langle p_1\rangle_q$ of the momentum component at the point $q$ does not depend on the sounding particle energy. The formal criterion of this is the existence of a limit

$$\langle p_1\rangle_q = \lim_{T\to 0}\frac{m\,\langle\Delta q^1\rangle}{T} \tag{52}$$

The problem of the existence of the limit (52) can be investigated by means of the quantum mechanics formalism. I shall not do this, but shall confine myself to the optimistic supposition that such a limit exists. The existence of the limit (52) for all points $q$ and a certain ensemble of free particles, described by the wave function $\psi$, means the possibility of measuring the mean value $\langle p_1\rangle_q$ of the momentum component at the point $q$ at the moment $t$ and of attributing it to the state described by the wave function $\psi$.

Using proper sounding beams one can measure mean values of other momentum components at the point $q$. The possibility of measuring the mean value $\langle p_i\rangle_q$ of the momentum at the point $q$ in the state $\psi$ implies the possibility of measuring the mean angular momentum $\langle[\mathbf{q}\times\mathbf{p}]\rangle_q$ and other mean values which are linear in the momentum components of a free particle.

Measuring the electron position one is able to calculate a distribution $|\psi(q)|^2$ over coordinates for all moments of time. This permits the calculation of all moments $\langle q^l\rangle$ ($l = 1, 2, \ldots$) of the electron coordinates and their time dependence (for simplicity the one-dimensional case is considered). Using Schrödinger's equation for the free particle one can show that the moments $\langle p^l\rangle$ ($l = 1, 2, \ldots$) of momentum are expressed by the relations

$$\langle p^l\rangle = \frac{m^l}{l!}\,\frac{d^l}{dt^l}\langle q^l\rangle, \qquad l = 1, 2, \ldots \tag{53}$$

where $m$ is the mass of a particle. As long as the moments $\langle q^l\rangle$ can be measured at all moments of time, in principle one can calculate all moments $\langle p^l\rangle$ ($l = 1, 2, \ldots$) and establish the momentum distribution for all moments of time. Since the free-particle energy is expressed through the particle momentum, formula (53) permits the calculation of the energy distribution for each moment of time. All these distributions can be attributed to a definite moment of time and consequently to a definite wave function.

However, the essential problem is whether or not the foregoing procedure is a measurement of momentum distribution. Although to a degree this is a question of terminology, it is usually taken that momentum distribution measurement is a procedure such that a certain value of momentum is obtained as a result of each single measurement. The set of all measured values of momentum constitutes a momentum distribution. The described procedure is not of such a kind. Here the result of a single measurement is a certain value of the coordinate. For this reason I shall not take this procedure as a momentum measurement.

Let us consider the angular momentum measurement in the experiment of Gerlach and Stern (1924). The detailed analysis of this experiment can be found in any textbook on quantum mechanics (see, for instance, Pauli, 1933; Blokhintsev, 1963; Bohm, 1965). I shall confine myself to analysing to what extent this experiment proves that an angular momentum projection upon a certain axis takes only values that are multiples of $\hbar$. Let there be an atom with an electron shell with a non-zero angular momentum $\mathbf{M}$ resulting from the orbital motion of the electrons. The electron spins are supposed to be compensated. The magnetic moment $\boldsymbol{\mu}$ is connected with the angular momentum $\mathbf{M}$ by means of the relation

$$\boldsymbol{\mu} = \frac{e}{2mc}\,\mathbf{M} \tag{54}$$

where $e$ is the electron charge and $m$ is the electron mass.

A beam of such atoms passes between the poles of a magnet with a very inhomogeneous field, the gradient of the field magnitude being directed along the magnetic field. In the magnetic field the atom obtains an additional energy and is affected by the force

$$\mathbf{F} = -\nabla(\boldsymbol{\mu}\cdot\mathbf{H}) \tag{55}$$

Passing through the magnetic field the atoms move normally to the lines of force. Under the action of the force (55) during motion through the magnetic field, the atoms obtain a momentum in the direction of the force $\mathbf{F}$, i.e. in the direction of the magnetic field. This momentum is different for different values of the magnetic moment projection $\mu_H$ upon the magnetic field direction. As a result the atom beam splits up into several beams depending on the magnetic moment projection $\mu_H$. After some time the beams are separated in space. Each beam can be recorded by the place where it strikes the screen. If this place is known, then one can calculate the corresponding value of the magnetic moment projection $\mu_H$ and the value $M_H$ of the angular momentum projection upon the magnetic field direction. Experiments show that the value of the angular momentum projection measured in such a way is a multiple of $\hbar$. Such is the conventional interpretation of the Stern-Gerlach experiment. The discreteness of the angular momentum projection $M_H$ and its being a multiple of $\hbar$ are explained by the corresponding properties of the eigenvalues of the angular momentum operator (Bohm, 1965, Chap. 14).
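The conventional picture just described can be sketched numerically. The code below is a minimal illustration, assuming that the magnetic moment projection is minus the Bohr magneton times an integer $m$ from $-l$ to $l$ (negative electron charge) and that a constant force of magnitude $\mu_H\,\partial H/\partial z$ acts for a fixed transit time. The function name, field gradient and transit time are hypothetical, and the overall sign depends on the sign convention chosen for the force.

```python
MU_B = 9.274e-21  # Bohr magneton, erg/G (Gaussian units)

def beam_momenta(l, grad_H, transit_time):
    """Transverse momenta acquired by the 2l+1 beams.

    Assumes mu_H = -MU_B * m for m = -l..l and a constant force
    mu_H * dH/dz acting during the transit time.
    """
    return [MU_B * (-m) * grad_H * transit_time for m in range(-l, l + 1)]

# Hypothetical apparatus parameters: gradient in G/cm, transit time in s.
momenta = beam_momenta(l=1, grad_H=1.0e4, transit_time=1.0e-4)
for m, p_z in zip(range(-1, 2), momenta):
    print(f"m = {m:+d}: p_z = {p_z:+.3e} g*cm/s")
```

The beams are equally spaced in transverse momentum and their number is $2l + 1$, which is what the conventional interpretation reads off from the screen.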

On the other hand, if the stationary states of the atom are discrete and each state obtains an additional energy

$$\Delta E_i = \Delta E_i(\mathbf{H}) \tag{56}$$

in the magnetic field, then independently of the nature of this energy change of the stationary state, the atom placed in the inhomogeneous magnetic field in the $i$th state is affected by the force

$$\mathbf{F}_i = -\nabla(\Delta E_i) \tag{57}$$


If the force is different for different discrete stationary states, then under its action the atom beam is split into several beams according to the different stationary states. Thus, splitting into discrete beams is connected with the discreteness of the atomic states and the difference of their energies in the external magnetic field. Strictly speaking, it is the energy of the atom in the external magnetic field that is measured in the Stern-Gerlach experiment. This was noted by Pauli (1933) and Blokhintsev (1968). The discreteness of the angular momentum projection $M_H$ results from the fact that the operator $\hat{M}_H$ commutes with the Hamiltonian and, hence, its eigenvalues can serve as a label of stationary states.

One can see from an analysis of the atom beam motion in an inhomogeneous electric field that it is the second interpretation that is correct. If the atom is placed in an electric field $\mathbf{E}$ then the energy of its stationary state changes a little. This is the so-called Stark effect. For simplicity let us consider hydrogen. It is known (see, for instance, Landau and Lifshits, 1963, section 77) that for not too large an electric field the change of the stationary state energy $\Delta E$ is linear in the electric field $\mathbf{E}$. In the first approximation of perturbation theory one gets for the $i$th unperturbed state

$$\Delta E_i = [\mathbf{D}]_i\cdot\mathbf{E} \tag{58}$$

where $[\mathbf{D}]_i$ means some quantity which depends on the eigenvectors of the unperturbed Hamiltonian and on the electric dipole operator $\mathbf{D}$, but not on the electric field $\mathbf{E}$. For some stationary states of hydrogen $[\mathbf{D}]_i$ is non-zero. The hydrogen atom beam, moving in the inhomogeneous electric field $\mathbf{E}$, is affected by the force

$$\mathbf{F} = -\nabla([\mathbf{D}]_i\cdot\mathbf{E}) \tag{59}$$

which is different for atoms in different stationary states. As a result of the motion in a proper electric field the beam is split up into a few discrete beams. From a measurement of the beam deflection one can calculate the values of $[\mathbf{D}]_i\cdot\mathbf{E}$, which take a set of discrete values. At the same time the electric dipole operator $\mathbf{D}$ has the form

$$\mathbf{D} = e(\mathbf{q}_p - \mathbf{q}_e) \tag{60}$$



where $\mathbf{q}_p$ and $\mathbf{q}_e$ are the position operators of the proton and the electron respectively. The components of the operator $\mathbf{D}$ commute with each other and have a continuous spectrum of eigenvalues.

Thus the interaction operator ($-\boldsymbol{\mu}\cdot\mathbf{H}$ or $-\mathbf{D}\cdot\mathbf{E}$) responsible for beam splitting has a discrete spectrum in one case and a continuous one in the other. But splitting into discrete beams is produced in both cases. This means that the discreteness of the beams is produced not by the discreteness of the spectrum of the interaction operator, but by that of the whole Hamiltonian, i.e. by the discreteness of the stationary states of the atom. Hence the Stern-Gerlach experiment cannot be used for testing the quantum mechanics statement which asserts that by measuring the quantity $M_H$, whose operator $\hat{M}_H$ has a discrete spectrum, only those values can be obtained which are equal to the eigenvalues of the operator $\hat{M}_H$. Thus, in the Stern-Gerlach experiment a sorting of stationary states is produced. It can be considered as a measurement of angular momentum provided the values $M_H$ are labels of the states.
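The electric-field case can be made concrete with the standard first-order Stark effect of the hydrogen $n = 2$ level. This particular level is chosen here only for illustration. The sketch works in atomic units ($e = a_0 = 1$) and uses the known result that the perturbation couples only the 2s and 2p ($m = 0$) states, with a matrix element of magnitude three times the field strength. The function names are hypothetical.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def stark_shifts_n2(field):
    """First-order shifts of the hydrogen n = 2 level, atomic units.

    Only 2s and 2p(m=0) are coupled, with matrix element of magnitude
    3*field; the 2p(m=+1) and 2p(m=-1) states are unshifted at first order.
    """
    low, high = symmetric_2x2_eigenvalues(0.0, 3.0 * field, 0.0)
    return sorted([low, 0.0, 0.0, high])

def forces(field_gradient):
    """Force on each state, F = -d(shift)/dz, for a field varying along z."""
    slopes = stark_shifts_n2(1.0)  # shift per unit field strength
    return [-s * field_gradient for s in slopes]

print(stark_shifts_n2(0.01))
print(forces(2.0))
```

Three distinct force values appear, so the beam splits into discrete components even though the dipole operator itself has a continuous spectrum. The discreteness comes from the stationary states, in line with the argument above.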

Let us analyse to what extent the measured result can be attributed to a definite wave function. Suppose in the Stern-Gerlach experiment that the wave packet describing the atom ensemble moves in the positive direction of the $x$-axis, as is shown in Figure 1. First of all the wave-packet spread in the $x$-direction has to be substantially larger than the size of the apparatus, otherwise during the time $T$ in which the wave packet passes through its own spread $L_x$, the phase difference (42) in the exponent of formula (49) will change within the limits

$$\Delta\varphi = \frac{\Delta E\,T}{\hbar} \approx \frac{p_x\,\Delta p_x}{M\hbar}\,T \approx \frac{p_x\,\Delta p_x}{M\hbar}\,\frac{M L_x}{p_x} = \frac{L_x\,\Delta p_x}{\hbar} \gtrsim 1 \tag{61}$$

where $p_x$ is the $x$-component of the atom momentum, i.e. the wave function has time to change during the experiment, which cannot be shorter than $T$.

Figure 1 Stern-Gerlach experiment with the wave packet moving in the positive direction of the $x$-axis.


Let the wave packet size $L_x$ be much larger than the apparatus size $l_x$, and the momentum uncertainty $\Delta p_x$ of the atom along the $x$-axis be small. Let $L_z$ be the wave-packet spread in the direction of the $z$-axis. During passage between the poles of the magnet the atom acquires a momentum

$$p_z = \mu_H\,\frac{\partial H}{\partial z}\,T \tag{62}$$

where $T$ is the interaction time of the atom with the magnetic field, i.e. the transit time. For beam separation it is necessary that

$$p_z > \Delta p_z > \frac{\hbar}{L_z} \tag{63}$$

where $\Delta p_z$ is the uncertainty of the momentum of the atom along the $z$-axis. Atoms with different magnetic moment projections have different energies. This causes a change of the phase difference between the different Fourier components of the wave function which correspond to different values of $\mu_H$. At best, during the time $T$ the phase difference change is

$$\Delta\varphi = \frac{1}{\hbar}\,\mu_H\,\frac{\partial H}{\partial z}\,L_z\,T \tag{64}$$

Substituting $T$ from (62) and using (63), one gets

$$\Delta\varphi = \frac{p_z L_z}{\hbar} > 1 \tag{65}$$

This means that during the measurement time the wave function changes substantially and the measurement result cannot be attributed to a definite wave function.
Thus, it follows from the analysis that quantities such as momentum, energy 
and angular momentum cannot be measured in the sense that measured values 
cannot be attributed to definite ensemble states. This means that quantum 
mechanics statements cannot be verified experimentally for all physical quan- 
tities R, because for this test it is necessary to attribute the measured value to a 
definite state. Statistical statements (8) can be verified for R=F(q), and 
perhaps for R=p, but there are those that cannot be verified, for instance, for 

Thus the statistical statements of quantum mechanics can be verified only in particular cases and cannot be verified in general. This is a corollary of the time-energy uncertainty principle. In principle, this is connected with the non-relativistic character of quantum mechanics, according to which a wave function is given at one moment of time. This is in contradiction to the time-energy uncertainty principle, which requires that an ensemble description be 'spread over time'. Indeed, in quantum mechanics the particle description by means of a wave function makes it 'spread over space', while relativistic symmetry requires it to be 'spread over time' as well.

If statistical statements of quantum mechanics cannot be tested experimentally then their consequences remain doubtful: for instance, von Neumann's theorem on hidden variables or the statement that in the measurement of a quantity $R$ only values equal to eigenvalues of the operator $\hat{R}$ can be obtained. For instance, as we have seen, the Stern-Gerlach experiment does not prove at all that angular momentum takes only values that are multiples of $\hbar$. At the same time some consequences of formula (8) may be correct, even if it is not always valid. At least it is possible to consider formula (8) as correct, because it has not been proved that it is incorrect, but only that it cannot be proved experimentally.

At the same time I believe an alternative conception would be welcome 
which would explain the impossibility of making measurements of quantities 
having energy-momentum character and which would, in general, consider as 
observable only those quantities which could be measured experimentally. 


Let us try to look at quantum mechanics from another viewpoint. Let us imagine that the motion of a microscopic particle is not deterministic (for instance, because of its interaction with the surroundings), i.e. its behaviour is like that of a Brownian particle. This means that the particle's world-line is distorted in a random way. Let us suppose that a statistical description of such non-deterministic world-lines can lead to the same results as those obtained by quantum mechanics. Of course, it is hopeless to try to obtain all the basic statements (1)-(5) of quantum mechanics because, on account of the von Neumann theorem on hidden variables, they are certainly inconsistent with the supposition that a particle is described by means of a definite world-line (see Moyal, 1949). As we have seen, not all statements of quantum mechanics can be tested experimentally. For this reason some hope remains for the successful formulation of world-line statistics in such a way that disagreement with statements (1)-(5) will occur only within an unobservable field.

It should be noted that attempts to interpret quantum mechanics from a 
classical or quasi-classical point of view are numerous. They are known as 
hidden-variables theories. A review of different versions of such theories and 
their bibliography can be found in the survey by Kaliski (1970) and in a 
monograph by Belinfante (1973). In most cases such theories represent 
attempts to interpret the basic statements of quantum mechanics from different 
viewpoints. Attempts to obtain the results of quantum mechanics starting from 
classical statistics in their pure form have not succeeded as a rule, because 
non-relativistic statistics have been used. This is motivated by the fact that 
quantum mechanics is also non-relativistic. 

As far as possible I shall adhere consistently to the relativistic viewpoint. First of all it is necessary to differentiate between non-relativistic and relativistic notions of state. The non-relativistic state (n-state) of a system is a set of quantities given at a certain moment of time. For instance, the particle n-state is determined at a certain moment by coordinates $\mathbf{q}$ and momenta $\mathbf{p}$, i.e. by a point in the phase space of coordinates and momenta. In non-relativistic physics the division of the description of physical phenomena into state and equations of motion is connected with the existence of absolute simultaneity and the existence of an invariant division of space-time into space and time (two invariants: time period and distance). Correspondingly, a particle is considered as a point in the three-dimensional space.

The relativistic state (r-state) is given over all space-time. For instance, the particle r-state is its world-line, described by the equation $q^i = q^i(\tau)$ ($i = 0, 1, 2, 3$), where $\tau$ is a parameter along the world-line. Equations of motion play
the part of restrictions imposed upon possible r-states. In relativistic physics the 
united description (without division into state and equations of motion) is 
connected with the absence of an invariant division of space-time into space 
and time (the only invariant: interval of space-time). Correspondingly, a 
particle is considered as a one-dimensional line in space-time (but not a point 
in space). 

In non-relativistic physics a statistical method is used for the description of non-deterministic systems (i.e. systems with uncertainty in the equations of motion or in the initial state). One considers a statistical ensemble, i.e. a set of many identical systems which are in different states. The dynamical systems constituting the ensemble are known as elements of the ensemble. The statistical ensemble is a deterministic dynamical system even if the constituent systems are non-deterministic. This means that the ensemble n-state can be calculated at time $t$ if its n-state at time $t_0$ ($t_0 < t$) is determined. For example, let a dynamical system $A$ consist of a non-deterministic particle. The ensemble consists of $N$ such independent particles ($N \to \infty$). The state of every particle is represented by a point in the phase space. Let $d\Omega$ be an element of volume of phase space, and let $dN$ be the number of points in $d\Omega$. Then

$$dN = W\,d\Omega \tag{66}$$

where $W = W(q, p)$ is a state density.

Although any individual system of an ensemble is indeterministic, it is found that the evolution of the ensemble n-state $W$ can be calculated, because $W$ obeys some equation whose form depends on the character of the random forces acting upon particles of the ensemble. In the case when the systems $A$ constituting an ensemble are deterministic, the form of the equation obeyed by $W$ is determined uniquely by the equation of motion of system $A$. Thus, $W$ is the n-state of the statistical ensemble as a dynamical system.

The equation which is obeyed by $W$ is invariant with respect to the transformation $W \to CW$, where $C$ is a constant. This is so because the ensemble behaviour does not depend on the number of systems constituting the ensemble, if this number is large enough. The constant $C$ can be chosen in such a way that $W(q, p)\,d\Omega$ represents the probability of detecting the n-state of a particle in the volume $d\Omega$ of the phase space.

That the ensemble n-state is a probability density is connected with the representation of the state of the system by a point (but not a line or surface) in the phase space. As $W(q, p)$ is the probability density of detecting the physical system in the state $(q, p)$, then by calculating $W(q, p)$ at any moment $t$ by means of the equation of motion of the statistical ensemble, one can calculate the evolution of the mean value $\langle F(q, p)\rangle$ of any function $F$ of the n-state $(q, p)$.
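The conventional scheme can be sketched with a small discrete ensemble. The code below is illustrative only: it assumes phase space is divided into unit cells labelled by integer pairs $(q, p)$, takes the state density as the number of ensemble elements per cell, and computes a mean value by weighting with the normalized density. The states and the function $F$ are hypothetical.

```python
from collections import Counter

def state_density(states):
    """Number of ensemble elements per phase-space cell (unit cell volume)."""
    return Counter(states)

def mean_value(W, F):
    """Mean of F(q, p) over the ensemble, using W normalized to one system."""
    total = sum(W.values())
    return sum(n * F(q, p) for (q, p), n in W.items()) / total

# Hypothetical ensemble of five one-particle systems, each an n-state (q, p).
states = [(0, 1), (0, 1), (1, 1), (1, -1), (2, 0)]
W = state_density(states)

def kinetic_energy(q, p):
    """Kinetic energy for unit mass, used here as the function F."""
    return 0.5 * p * p

mean_E = mean_value(W, kinetic_energy)

# Rescaling W -> C*W (three copies of every system) leaves the mean unchanged.
W_scaled = Counter({cell: 3 * n for cell, n in W.items()})
mean_E_scaled = mean_value(W_scaled, kinetic_energy)
print(mean_E, mean_E_scaled)
```

The second computation shows the invariance under multiplication of the density by a constant: only the normalized density enters the mean value.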

I have described in general the conventional scheme of the statistical 
ensemble application for describing the behaviour of the non-deterministic 
systems. Three essential points in this scheme should be stressed. 

(1) The transition from a physical system to the ensemble, i.e. the method of construction of the ensemble state.

(2) The determination of the equations which are obeyed by the ensemble state, and the solution of these equations.

(3) The transition from an ensemble state to an individual system, i.e. the method of calculation of the statistical characteristics of the non-deterministic system proceeding from an ensemble state.

In the generalization of the statistical method to the relativistic case different 
variants are possible. I shall consider three of them. 

(1) In the first variant the basic statement asserts that in the relativistic case the ensemble state is represented by a probability density. For this it is necessary that the ensemble state is represented by a point in a phase space. To achieve this, the particle description by means of the r-state (world-line) is dropped and one considers the intersection points of the world-line with different three-dimensional surfaces or points marked by parameters such as proper time (Hakim, 1967a, b, 1968). Such points together with momenta represent a state of a particle, as in non-relativistic physics. Requirements of relativity are taken into account by imposing Lorentz-invariance conditions upon the corresponding equations. Such an approach is most widely accepted, but I shall not use it. I believe that using the world-line as the main subject of a theory is far more important than the application of the developed formalism of probability theory.

(2) The second method consists in considering any particle r-state (world-line) as a point in a certain functional phase space $M$, the r-state being kept as the basic subject of the theory and the ensemble state being a probability density on $M$. However, if $l$ is a segment of the world-line located within a region $\Omega$ of space-time, then the question of which region of $M$ corresponds to $\Omega$ will be resolved depending on the behaviour of the world-line outside $\Omega$. This means non-locality of description. Such a description seems to me unsatisfactory.


(3) The method which I shall use keeps the r-state as the basic subject of the theory. It realizes a local description, but the ensemble state, i.e. the state density of the systems constituting the ensemble, cannot be treated as a probability density. For instance, let there be an ensemble of systems consisting of one particle, i.e. an ensemble of world-lines. Let $ds_i$ ($i = 0, 1, 2, 3$) be an infinitesimal area at the point $q$ and $dN$ be the number of world-lines crossing $ds_i$. Then

$$dN = j^i\,ds_i \tag{67}$$

where $j^i$ is a factor which is, by definition, the density of r-states (world-lines) at the point $q$ of space-time. The $j^i$ considered at a certain moment of time is the ensemble n-state. The same $j^i$ considered in the whole space-time (or in some region) is the r-state. In this sense the r-state of the ensemble coincides with the n-state.
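The counting of world-lines crossing a surface element can be illustrated in one space and one time dimension. The sketch below is an assumption-laden toy model: world-lines are straight lines $x = x_0 + vt$, a surface element with time-like normal is a spatial interval at fixed time, and one with space-like normal is a time interval at fixed position, where crossings are counted with a sign given by the direction of motion. The function names and the sample lines are hypothetical.

```python
def crossings_of_time_slice(lines, t0, a, b):
    """Number of world-lines x = x0 + v*t found in a <= x < b at time t0."""
    return sum(1 for x0, v in lines if a <= x0 + v * t0 < b)

def crossings_of_position(lines, xs, t1, t2):
    """Signed number of world-lines crossing x = xs during t1 <= t < t2.

    Lines moving toward +x count +1, lines moving toward -x count -1,
    and lines at rest never cross a fixed position.
    """
    count = 0
    for x0, v in lines:
        if v == 0:
            continue
        t_cross = (xs - x0) / v
        if t1 <= t_cross < t2:
            count += 1 if v > 0 else -1
    return count

# Hypothetical ensemble of four straight world-lines, each given as (x0, v).
lines = [(0.0, 1.0), (1.0, 1.0), (2.0, -1.0), (3.0, 0.0)]

n_time = crossings_of_time_slice(lines, t0=1.0, a=0.5, b=2.5)
n_space = crossings_of_position(lines, xs=1.5, t1=0.0, t2=2.0)
print(n_time, n_space)
```

The first count plays the role of the time component of the density multiplied by the spatial interval and is never negative. The second plays the role of the spatial component multiplied by the time interval and can have either sign, which is one way to see why the density of world-lines as a whole cannot be read as a probability density.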

In the case when the ensemble elements are dynamical systems consisting of $N$ particles, the r-state of such a system is described by an $N$-dimensional surface in the $4N$-dimensional space $V_{12\ldots N} = V_1 \otimes V_2 \otimes \cdots \otimes V_N$, which is a tensor product of the spaces $V_i$ ($i = 1, 2, \ldots, N$) for each of the particles (see, for instance, Hakim, 1967a, b). The state density of such systems is described by the pseudotensor $j^{a_1 a_2 \ldots a_N}$ on $V_{12\ldots N}$, antisymmetric in all indices ($a_1, a_2, \ldots, a_N = 1, 2, \ldots, 4N$).

Later on, for simplicity, I shall confine myself to the case when the ensemble element is a dynamical system consisting of one particle. The method of transition from the system r-state to the statistical ensemble is defined. It coincides with the conventional method: the ensemble state is the state density of the systems constituting the ensemble. The reverse transition from the statistical ensemble to the properties of the system cannot coincide with the conventional one, because the conventional method is based on the fact that an ensemble state $W(q, p)$ is the probability density of detecting a system in the state $(q, p)$. Strictly speaking, neither $j^i$ nor $j^{a_1 a_2 \ldots a_N}$ can be treated in such a way. For this reason the transition from the statistical ensemble to the non-deterministic system is based on the use of additive quantities.

Definition. The quantity B is an additive one if the value of B for several 
independent systems is equal to the sum of values of B for every system. 

Energy, momentum, angular momentum and their densities are examples of additive quantities. The statistical ensemble is a set of independent systems. For this reason any additive quantity attributed to the statistical ensemble as a dynamical system is a sum of the values of this quantity for all systems constituting the ensemble. As the ensemble behaviour does not depend on the number of systems in the ensemble, the equations for the ensemble state $j$ are invariant with respect to the transformation $j \to Cj$, where $C$ is a constant and $j$ denotes any ensemble state: $W$, $j^i$, .... Hence, $j^i$ can be normalized on one system. In this case the value of any additive quantity of the ensemble is equal to the mean value over the ensemble of this quantity for the systems constituting the ensemble. In the non-relativistic approximation, when one of the components of $j$ is non-negative and conserved (for instance, $j^0$ for a one-particle system), it is possible to treat this component as a corresponding probability density. In this case it is possible to obtain additional information.
Let us formulate the above in the axiomatic form. 

The statistical principle 

A deterministic dynamical system, which is called a statistical ensemble, corresponds to a non-deterministic* dynamical system $A$ whose state is described by quantities $\xi$.

(1) A state $j$ of the statistical ensemble is a state density of systems $A$.

(2) The equations for the ensemble state $j$ are invariant with respect to the transformation

$$j \to Cj, \qquad C = \text{constant} \tag{68}$$

(3) If the ensemble state $j$ has a proper normalization (on one system), every additive quantity $B$ attributed to the statistical ensemble as a dynamical system is the mean value of quantity $B$ for system $A$.

The statistical principle fits either the relativistic or non-relativistic case. It 
settles the question about the determination of the ensemble state and about 
the determination of the non-deterministic system properties, but it does not 
determine which equations are obeyed by the ensemble state. However, if the ensemble elements are deterministic systems, then the equation satisfied by an ensemble state $j$ is determined by the equations which a single ensemble element obeys. This fact can be used to simplify the derivation of the equations which the ensemble state $j$ obeys.

*The statistical principle can be applied to a deterministic system, if its initial conditions are not exactly determined.

I shall show this by a simple example of an ensemble $E$ the elements of which are free particles of mass $m$, i.e. all possible timelike straight lines in space-time. Although it is possible to introduce a state density for such an ensemble by means of (67), it is still not possible to describe the ensemble completely. This means that if the state $j^i$ is given at any moment $t$, then, in general, $j^i$ cannot be determined at another moment. The state of such an ensemble can be described by a distribution function $f(q, p)$, $q = \{q^0, q^1, q^2, q^3\}$, $p = \{p_0, p_1, p_2, p_3\}$, which satisfies the equation

$$\frac{\partial}{\partial q^i}\left(p^i f(q, p)\right) = 0 \tag{69}$$

where $p^i$ is the 4-momentum of a particle. Here and later on summation is made over like arabic super- and subscripts from zero to three. The distribution function $f(q, p)$ vanishes except on the surface


$$p_i\,g^{ik}\,p_k = m^2 c^2 \tag{70}$$

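where $g^{ik}$ is the metric tensor

$$g^{ik} = \begin{pmatrix} c^{-2} & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{71}$$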

and $m$ is the mass of a particle. The meaning of the distribution function is determined by the fact that the stream density $j^i$ of world-lines at the point $q$ is

$$j^i = \int \frac{p^i}{m}\,f(q, p)\,d^4p, \qquad d^4p = dp_0\,dp_1\,dp_2\,dp_3 \tag{72}$$


However, instead of the description by means of a distribution function one can use the following. Let us consider the ensemble $E$ as consisting of elements $E_p$, where $E_p$ is an ensemble whose elements are straight lines having the direction of the unit vector $p_i/mc$. The ensembles $E_p$ are completely described by the pseudovector $j^i$. I shall call such an ensemble a pure one. As long as the $E_p$ are dynamical systems, they can be elements of the ensemble $E$. It is reasonable to normalize the states $j^i$ of the $E_p$ in a similar way.

It is easy to verify that the equations which the $j^i(q)$ satisfy have the form

dq ^j gskj ' 

= 0,1,2,3 

Let us normalize the $j^i$ of one system by means of

$$\int_{\Sigma} j^i \, dS_i = 1 \tag{74}$$

where $\Sigma$ is an infinite space-like hypersurface and $dS_i$ is an element of the
hypersurface. All physical quantities attributed to a pure ensemble as a
dynamical system can be expressed through the state $j^i$.

In the given case, when all ensembles $E_p$ can be labelled by the parameters $p_i$, the
state of the ensemble E can be described by means of non-negative quantities
$W(p)$ which represent the probability density of detecting a pure ensemble $E_p$ in
the ensemble E. Finally, here all can be reduced to a distribution function


However, in the case when non-deterministic one-particle systems are
considered, the equations which the pure ensemble state satisfies can be more
complicated than equations (73), and the possibility of labelling all their
solutions by means of a finite number of parameters is not evident.

For example, suppose it is insufficient for the description of a single system to
give coordinates and momenta, but it is necessary to give the quantities
$q, \dot q, \ddot q, \ldots$, where $q = \{q^0, q^1, q^2, q^3\}$ and the dot denotes
differentiation with respect to the proper time $\tau$. Such a situation arises, for
example, in the consideration of the Lorentz-Dirac equations. In this case the
distribution function $f(q, \dot q, \ddot q)$ is used (Hakim, 1967b). In the
non-relativistic case a more general consideration can be found in the book by
Vlasov (1966).

In place of the consideration of distribution functions of the type
$f(q, \dot q, \ddot q, \ldots)$ one can consider an arbitrary ensemble E as an
ensemble whose elements are pure ensembles $E_j$ described by means of $j^i$, with
$j^i$ being a function of q only. The dependence of the ensemble E on distributions
over $\dot q, \ddot q, \ldots$ manifests itself as the implicit dependence of E on $E_j$.
This has been shown in the simple example of the distribution function $f(q, p)$.
The consideration of the ensemble E as consisting of elements $E_j$ has the
advantage that, in general, one need not be interested in how many derivatives
like $\dot q, \ddot q, \ldots$ there are and in which way the distribution function
depends on them. All that it is necessary to know are the equations which the
state of the pure ensemble satisfies. The state of an arbitrary ensemble satisfies
equations which can be obtained merely as a result of a formal transformation of
the pure ensemble equations. Just this fact explains the mysterious circumstance
that the quantum mechanical pure state (wave function) depends only on
coordinates, while the usual classical distribution function depends on
coordinates and momenta. The dependence on momenta (and not only on momenta,
but, in general, on quantities like $\dot q, \ddot q, \ldots$) is taken into account in
quantum mechanics in the consideration of mixed states, i.e. ensembles whose
elements are pure ensembles.

In the general case the state of an ensemble E is described by a function W of
the states $j^i$ of the pure ensembles $E_j$ which are elements of the ensemble E. The
state of each pure ensemble $E_j$ is completely described by four functions $j^i$
(i = 0, 1, 2, 3). This means that the $j^i$ satisfy certain equations, and that all
physical quantities are expressed through the $j^i$. In the case in which the state W
of the ensemble E is considered as an r-state, W should be considered as a
function of four quantities $j^i$ (i = 0, 1, 2, 3) which are considered as functions of
the space coordinates q and time t. As it is necessary to use the relation derived
using non-relativistic statistics, it is convenient to consider the state W of the
ensemble E as the n-state, determined at any moment of time t. In this case the
n-state W of the ensemble is a function of four quantities $j^i$ (i = 0, 1, 2, 3),
considered as functions of the space coordinates q. Besides, W is a
function of time t. It is supposed that, if for a pure ensemble $E_j$ the state $j^i$
(i = 0, 1, 2, 3) is given at the moment t, then it can be determined at all subsequent
moments of time. The dependence of the n-state $W[j]$ on time can be
determined from the relation

$$W_t[j_t] = W_0[j_{t=0}], \qquad j = \{j^0, j^1, j^2, j^3\} \tag{75}$$

The index 't' of $W_t$ and $j_t$ shows the moment at which these values are taken.
The relation (75) expresses the fact that the change of form of the functional
$W[j]$ is conditioned only by the change of the functions $j^i$ which is described by
the equations of motion. Let the equations of motion of the n-state of the pure
ensemble $E_j$ have the form

'^ *-{£•£•£}' / = °' 1 ' 2 ' 3 (?6) 



^''^ a =tvW K ' =0 ' 1 ' 2 ' 3 

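The $G^i$ are some functions of the $j^i$ and their derivatives. Differentiating (75) with
respect to t and using the expressions (76) for $\partial j^i/\partial t$, one gets the following
linear equation with variational derivatives:

$$\frac{\partial W}{\partial t} + \sum_{i=0}^{3} \int G^i\, \frac{\delta W}{\delta j^i(\mathbf{x})}\, d\mathbf{x} = 0 \tag{77}$$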


Equations of motion (76) of a pure ensemble are characteristics of the linear 
equation (77). 
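
The step from (75) and (76) to the linear equation can be sketched as follows. This is only an outline under the assumption that $W_t$ is a differentiable functional; $\mathbf{x}$ stands for the space coordinates and $G^i(\mathbf{x})$ for the right-hand side of the equations of motion (76) evaluated at $\mathbf{x}$. Since (75) states that $W_t[j_t]$ does not change along a solution $j_t$, its total time derivative vanishes:

```latex
\begin{align}
0 &= \frac{d}{dt}\, W_t[j_t]
   = \frac{\partial W_t}{\partial t}[j_t]
     + \sum_{i=0}^{3} \int \frac{\delta W_t}{\delta j^i(\mathbf{x})}\,
       \frac{\partial j_t^i(\mathbf{x})}{\partial t}\, d\mathbf{x} \\
  &= \frac{\partial W_t}{\partial t}[j_t]
     + \sum_{i=0}^{3} \int G^i(\mathbf{x})\,
       \frac{\delta W_t}{\delta j^i(\mathbf{x})}\, d\mathbf{x}.
\end{align}
```

The functional W is thus constant along the solutions of (76), which is the sense in which those equations are the characteristics of the linear equation.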

Let us apply the above consideration to an arbitrary ensemble of non- 
deterministic world-lines. In the paper (Rylov, 1971) in the non-relativistic 
approximation I considered a pure ensemble the elements of which are 
non-deterministic systems, each one consisting of one particle moving in a 
given electromagnetic field. It has been shown that it is possible to choose the 
equation of motion (76) of a pure ensemble in such a way that from statistical 
principles all basic results of one-particle quantum mechanics could be 
deduced. Namely, supposing that $j^\alpha/j^0$ ($\alpha = 1, 2, 3$) has a potential and can be
represented in the form 

■75 = — T-^, « = 1,2,3 
; m dq 

it was shown that the function 

!/f = Vpexp|— j, p=/° 
satisfies the Schrodinger equation 

dt Y 


2m a = i V dq c / c 





$A_i$ (i = 0, 1, 2, 3) is the 4-potential of the electromagnetic field, and e and m are
the charge and mass of the particle, respectively.

With proper normalization of the wave function (79) the rule for calculation
of the mean values $\langle R \rangle$ of the quantity R has the form (8) for quantities
representing momentum, energy, angular momentum and arbitrary functions
of coordinates. The stationary states of an ensemble are described by a wave
function which is an eigenfunction of the Hamiltonian (81).


Thus the difference of these results from the basic statements of quantum 
mechanics consists only in that (8) is fulfilled not for all physical quantities, but 
only for certain ones. There are differences just in those points that, as we have 
seen, cannot be tested experimentally. 

Now, using the statistical principle, I am going to generalize the results
obtained for a pure ensemble of one-particle systems to the case of an arbitrary
ensemble of one-particle systems. In the case when restriction (78) is fulfilled,
the $j^i$ (i = 0, 1, 2, 3) are determined by the wave function (79). In this case the
n-state W of an arbitrary ensemble E can be considered as a functional $W[\psi]$ of
two independent quantities $\psi$ and $\psi^*$ ($\psi^*$ is the complex conjugate of $\psi$). The
relation (77) takes the form

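$$\frac{\partial W}{\partial t} + \frac{1}{i\hbar} \int \left\{ \frac{\delta W}{\delta \psi(\mathbf{x})}\, H\psi(\mathbf{x}) - \frac{\delta W}{\delta \psi^*(\mathbf{x})}\, \bigl(H\psi(\mathbf{x})\bigr)^* \right\} d\mathbf{x} = 0 \tag{81a}$$

Here * denotes the complex conjugate, and H is the Hamiltonian (81). As $W[\psi]$
represents the probability density of finding a pure ensemble described by the
wave function $\psi$, the mean value $\langle R \rangle_E$ of the quantity R over the ensemble
E can be written in the form

$$\langle R \rangle_E = \int W[\psi]\, \langle R \rangle_\psi\, d[\psi] \tag{82}$$

where the integral denotes integration over the whole functional space of the
values of the functions $\psi$ and $\psi^*$, and $\langle R \rangle_\psi$ denotes the mean value of the quantity
R over the pure ensemble described by the wave function $\psi$. All functions are
supposed to be normalized by means of the relation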



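Let us choose an orthogonal basis $\{\chi_i\}$ in the Hilbert space of functions $\psi$. Let
the function $\psi$ be decomposed on this basis according to the relations (20) and
(21), with decomposition coefficients $a_i = a_i[\psi]$ being functionals of $\psi$ and $\psi^*$.
Due to (8) and (20) one obtains

$$\langle R \rangle_\psi = \sum_{i,k} a_i^*[\psi]\, a_k[\psi]\, R_{ik} \tag{84}$$

where $R_{ik}$ is a matrix element of the operator R in the basis $\{\chi_i\}$

$$R_{ik} = \int \chi_i^*(\mathbf{x})\, R\, \chi_k(\mathbf{x})\, d\mathbf{x} \tag{85}$$

Substituting (84) into (82) one obtains

$$\langle R \rangle_E = \sum_{i,k} R_{ik}\, U_{ki} = \mathrm{Sp}\{RU\} \tag{86}$$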


where the following notation is used

$$U_{ik} = \int W[\psi]\, a_i[\psi]\, a_k^*[\psi]\, d[\psi] \tag{87}$$

and U denotes an operator with matrix elements $U_{ik}$ in the basis $\{\chi_i\}$. The
operator U depends only on the state $W[\psi]$ of the ensemble E but does not
depend on the quantity R. The relation (86) coincides with (3), which describes
the rule for the calculation of the mean value for an arbitrary ensemble.
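
As a numerical sanity check of the rule (86), the functional integral over $\psi$ can be replaced by a finite weighted sum over a few pure ensembles. The sketch below does this in a two-dimensional toy Hilbert space. The matrix `R`, the two states and the weights are hypothetical values chosen only for illustration, and the helper names are not taken from the text.

```python
# Toy check that the ensemble mean equals Sp{R U} when the functional
# integral over psi is replaced by a finite weighted sum (an assumption
# made only for this illustration).

def mean_in_pure_state(a, R):
    """<R>_psi = sum_{i,k} conj(a_i) a_k R_ik for expansion coefficients a."""
    n = len(a)
    return sum(a[i].conjugate() * a[k] * R[i][k]
               for i in range(n) for k in range(n))

def state_operator(weights, states):
    """U_ik = sum over pure ensembles of W * a_i * conj(a_k)."""
    n = len(states[0])
    return [[sum(w * a[i] * a[k].conjugate() for w, a in zip(weights, states))
             for k in range(n)] for i in range(n)]

def trace_of_product(R, U):
    """Sp{R U} = sum_{i,k} R_ik U_ki."""
    n = len(R)
    return sum(R[i][k] * U[k][i] for i in range(n) for k in range(n))

# Hypothetical data: a Hermitian matrix R and two normalized pure states.
R = [[1.0, 2 - 1j], [2 + 1j, -3.0]]
s = 2 ** -0.5
states = [[1 + 0j, 0j], [s + 0j, 1j * s]]
weights = [0.25, 0.75]  # probabilities W of the two pure ensembles

direct = sum(w * mean_in_pure_state(a, R) for w, a in zip(weights, states))
U = state_operator(weights, states)
via_trace = trace_of_product(R, U)
print(direct, via_trace)
```

Both numbers agree because the weights enter linearly, which is why the operator U can be built once from the ensemble and reused for every quantity R.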

Multiplying (81a) by $a_i[\psi]\, a_k^*[\psi]$ and integrating over the functional space of
functions $\psi$ one obtains, after transformations, an equation describing the
evolution of the matrix elements $U_{ik}$ of the state operator U. It has the form

$$i\hbar\, \frac{\partial U}{\partial t} + UH - HU = 0 \tag{88}$$

and is equivalent to (4).

Thus, in the non-relativistic approximation one succeeds in showing that for
an arbitrary ensemble of one-particle systems the basic statements of quantum
mechanics (1)-(4) can be obtained starting from the statistical principle, with
the restriction that the calculation rule (86) for mean values can only be applied
to arbitrary functions of coordinates and to the additive quantities: energy,
momentum and angular momentum. The last restriction arises because the
expression (84) for $\langle R \rangle_\psi$ has been obtained from (8), which is valid in this case
only for those quantities R.

Using the statistical principle, the results derived for a pure ensemble of 
two-particle systems (Rylov, 1973a, b) can be generalized to the case of an 
arbitrary ensemble. 

It is interesting to consider how the process of momentum measurement
described in the second section looks from the point of view of relativistic
statistics (this is what I call the conception which uses the r-state and the
statistical principle). Let us imagine an ensemble of usual Brownian particles
(specks of dust), placed in a moving gas. The gas velocity is supposed to be less
than the thermal velocity of the gas molecules, and the mass of a speck of dust is
supposed to be of the order of the mass of a molecule. The motion of the
particles has the character of a random walk. If the gas motion is neglected
then the mean square displacement of a particle during the time t with respect
to its initial position has the form

$$\langle (\mathbf{q} - \mathbf{q}_0)^2 \rangle = 2Dt \tag{89}$$

where $\mathbf{q}_0$ is the initial position of a particle, and $\mathbf{q}$ is its position at time t. D is a
diffusion coefficient. Defining the mean velocity $\mathbf{v}_t$ during the period t by means of

$$\mathbf{v}_t = \frac{\mathbf{q} - \mathbf{q}_0}{t} \tag{90}$$

one finds that the mean velocity of the particle during the time t is reduced
as the period t is increased, and it tends to zero as $t \to \infty$. This is conditioned by
the fact that the root-mean-square displacement is proportional to $\sqrt{t}$, as is seen
from (89). If the gas motion is taken into account, then the contribution of the
systematic motion of the gas to the root-mean-square displacement is
proportional to $ut$, where u is the mean velocity of the gas. This contribution
dominates if t is large enough. As a result one of the gas streams takes the speck
of dust away. The mean velocity during the time t will be different for different
specks of dust but, in general, as $t \to \infty$ it will not tend to zero but will tend to
some value which is equal to the velocity of the gas stream taking the speck of
dust away. The velocity of each element of gas is supposed to tend to a constant
as $t \to \infty$. The magnitude of the velocity of the speck of dust averaged over the
period t (with $t \to \infty$) depends only on which gas stream catches that speck.
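
The competition between the diffusive and the systematic contribution can be put into numbers. The sketch below assumes that the mean velocity over the period t is the displacement divided by t, so that for pure diffusion its root-mean-square value is $\sqrt{2Dt}/t$. The values of D and u are hypothetical and in arbitrary units, and the function names are illustrative only.

```python
import math

def rms_mean_velocity(D, t):
    """RMS of (q - q0)/t for pure diffusion with <(q - q0)^2> = 2 D t."""
    return math.sqrt(2.0 * D * t) / t

def crossover_time(D, u):
    """Time at which the drift displacement u*t equals the diffusive
    root-mean-square displacement sqrt(2 D t)."""
    return 2.0 * D / u ** 2

D, u = 0.5, 0.1  # hypothetical diffusion coefficient and gas velocity
for t in (1.0, 100.0, 10000.0):
    # The diffusive part decays like 1/sqrt(t); the drift part stays at u.
    print(t, rms_mean_velocity(D, t), u)
print("crossover time:", crossover_time(D, u))
```

For periods much shorter than the crossover time the averaged velocity is dominated by the random walk, for much longer periods by the gas stream.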

Let us estimate a distribution of the dust specks over velocities using (90). If 
the time t is of the order of the mean free time then the velocity distribution will 
be a Maxwellian one with a temperature close to that of the gas. With 
increasing time t the distribution remains Maxwellian, but its temperature 
reduces until the root-mean-square velocity of the specks calculated by means 
of (89) and (90) becomes of the order of the gas speed. With further increasing 
of time t the velocity distribution of the specks is determined mainly by the 
gas-stream velocities. 

The dependence of the velocity distribution on the measurement time t is 
rather like that of an electron in the state described by a wave function. The 
difference is only in the absence of an agent taking the electron away as the gas 
does with the dust specks. In quantum mechanics the statistical ensemble 
(dynamical system!) plays the part of such a carrying agent. The conservation 
laws of energy and momentum are fulfilled for the ensemble. This prevents the 
electron momentum being dissipated and provides a constant motion of the 
electron in a definite direction. Thus, from the viewpoint of relativistic statistics 
the q-momentum measurement represents the measurement of the momen- 
tum of an electron averaged over a long period. 

Relativistic statistics permits one to obtain all the basic statements of 
quantum mechanics with the reservation mentioned above. It is curious that 
this conception does not contain anything typical for quantum physics. It is 
based completely upon principles of classical mechanics and classical statistics. 
Besides, the motion of microscopic particles is assumed to be non-deterministic.
Such an assumption in itself is not specific to quantum mechanics. It occurs
occasionally in classical statistics, for instance in the description of Brownian
particles. The only essential supposition is that about the form of the term which
is added to the Lagrangian of a pure ensemble of deterministic particles for the
description of particle indeterminacy. This term has a universal form (it does
not depend on particle characteristics) with Planck's constant as its
coefficient. This is the only place where Planck's constant $\hbar$ is
introduced into the theory. The introduction of such a term cannot be treated
as a principle because some supposition about the character of the particle 
motion indeterminacy should be made in any case. 

Being in its form a classical (non-quantum) conception, relativistic
statistics in no case means a return to classical mechanics. It gives a less
detailed description of a dynamical system than quantum mechanics does, and
far less detail than in the classical case. It is a further step on the way to the
restriction of detailed description in the microcosm.

Being essentially a relativistic theory, relativistic statistics coincides with
quantum mechanics only in the non-relativistic approximation. Linearity
(equation linearity, linear operators, the linear superposition principle), which is
raised into a principle by quantum mechanics, is one of the main reasons for the
advance of quantum mechanics. All this is conditioned by the non-relativistic
character of the approximation.

It is most curious that the conventional way of joining quantum theory with
relativity, using wave functions, the linear superposition principle and so on,
cannot be understood from the viewpoint of relativistic statistics. According
to relativistic statistics nothing of that kind should be done. The non-relativistic
approximation must not be used: the problem should be solved exactly. This
claim of relativistic statistics does not seem a complete absurdity, if it is taken
into account that in a consistent application relativistic statistics contains a
possibility of the generation of pairs, i.e. particle-antiparticle pairs (Rylov, 1975).
The future will show how firmly the claims of relativistic statistics are founded.


Aharonov Y. and Bohm D. (1961) 'Time in the quantum theory and the uncertainty relation for time and energy', Phys. Rev., 122, 1649-1658.
Ballentine I. L. (1970) 'The statistical interpretation of quantum mechanics', Rev. Mod. Phys., 42,
Belinfante F. J. (1973) A Survey of Hidden-Variables Theories, Pergamon Press, Oxford.
Blokhintsev D. I. (1963) 'Foundation of quantum mechanics', Moscow, Leningrad (in Russian).
Blokhintsev D. I. (1968) The Philosophy of Quantum Mechanics, Reidel, Dordrecht, Holland. (Original Russian edition 1965).
Bohm D. (1965) Quantum Theory, Prentice-Hall, Englewood Cliffs, New Jersey.
Bohr N. (1918) 'On the quantum theory of line spectra', Kongelige Danske Videnskabernes Selskabs Skrifter, Naturvidenskabelig og Mathematisk Afdeling, 8 Række, Bd. 4, 1, 1-118.
Bohr N. (1928) 'The quantum postulate and the recent development of atomic theory', Nature, 121, 580-590.
Einstein A. (1949) in Albert Einstein: Philosopher-Scientist, P. A. Schilpp, Ed., Library of the Living Philosophers, Evanston (reprinted by Harper and Row, New York, p. 665).
Fock V. and Krylov N. (1947) 'On the uncertainty relation between time and energy', J. Phys. (U.S.S.R.), 11, 112-120.
Fock V. A. (1962) 'On the uncertainty relation between time and energy and one attempt to disprove it', J. Experim. Theoret. Phys., 42, 1135-1139 (in Russian).
Gerlach W. and Stern O. (1924) 'Über die Richtungsquantelung im Magnetfeld', Ann. Physik, 74,
Hakim R. (1967a) 'Remarks on relativistic statistical mechanics. I', J. Math. Phys., 8, 1315-1344.
Hakim R. (1967b) 'Remarks on relativistic statistical mechanics. II. Hierarchies for reduced densities', J. Math. Phys., 8, 1379-1400.
Hakim R. (1968) 'Relativistic stochastic processes', J. Math. Phys., 9, 1805-1818.
Heisenberg W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Physik, 43, 172-198.
Heisenberg W. (1930) Die Physikalischen Prinzipien der Quantentheorie, Leipzig, Section II, 2d.
Heisenberg W. (1955) 'Quantum theory and its interpretation', in Niels Bohr and the Development of Physics, London.
Kaliski S. (1970) 'A tentative classical approach to quantum mechanics', Proceedings of Vibration Problems, 11, 3-17.
Landau L. and Peierls R. (1931) 'Erweiterung des Unbestimmtheitsprinzips für die relativistische Quantentheorie', Z. Physik, 69, 56-69.
Landau L. D. and Lifshits E. M. (1963) Quantum mechanics, Moscow, Section 77 (in Russian).
Mandelstam L. and Tamm Ig. (1945) 'The uncertainty relation between energy and time in non-relativistic quantum mechanics', J. Phys. (U.S.S.R.), 9, 249-254.
Mandelstam L. I. (1950) 'Lectures on foundation of quantum mechanics (Theory of indirect measurements)', in Complete collection of proceedings, 5, 345-415 (in Russian).
Margenau H. (1963) 'Measurement in quantum mechanics', Ann. Phys., 23, 469-485.
Moyal J. E. (1949) 'Quantum mechanics as a statistical theory', Proc. Cambridge Phil. Soc., 45,
Neumann J. V. von (1932) Mathematische Grundlagen der Quantenmechanik, Berlin.
Pauli W. (1933) 'Die allgemeinen Prinzipien der Wellenmechanik', in H. Geiger and K. Scheel, Handbuch der Physik, 2nd ed., Vol. 24/1, Springer, Berlin, Chap. 2, pp. 83-272.
Pearle P. (1967) 'Alternative to the orthodox interpretation of quantum theory', Am. J. Phys., 35,
Popper K. R. (1959) The Logic of Scientific Discovery, Basic Books, New York.
Reece G. (1973) 'Theory of measurement in quantum mechanics', Intern. J. of Theoret. Phys., 7,
Robertson H. P. (1929) 'The uncertainty principle', Phys. Rev., 34, 163-164.
Rylov Yu. A. (1971) 'Quantum mechanics as a theory of relativistic Brownian motion', Ann. Physik, 27, 1-11.
Rylov Yu. A. (1973a) 'Quantum mechanics as relativistic statistics. I: the two-particle case', Intern. J. Theoret. Phys., 8, 65-83.
Rylov Yu. A. (1973b) 'Quantum mechanics as relativistic statistics. II: the case of two interacting particles', Intern. J. Theoret. Phys., 8, 123-139.
Rylov Yu. A. (1975) 'The problem of particle generation in classical mechanics', in Investigation of Cosmic rays, Moscow (in Russian, pp. 171-177).
Vlasov A. A. (1966) Statistical Distribution Functions, Moscow (in Russian).
Wick G. C., Wightman A. S. and Wigner E. P. (1952) 'The intrinsic parity of elementary particles', Phys. Rev., 88, 101-105.
Wigner E. (1963) 'The problem of measurement', Am. J. Phys., 31, 6-15.


Uncertainty, Correspondence and 
Quasiclassical Compatibility 


Polish Academy of Sciences, Warsaw 


There was no exaggeration in the famous metaphor of Sommerfeld comparing 
the correspondence principle with a magic wand. In fact, within the framework 
of the Old Quantum Theory this Principle was the only method which enabled 
physicists to evaluate such physical quantities as the intensities of spectral lines 
and the polarizations of atomic radiation. However the very nature and the 
internal logic of the correspondence itself remained unknown. Using the more 
fashionable terms of cybernetics we might say that the Bohr correspondence 
principle has worked like a black box. 

There were many such 'magic wands' in quantum mechanics; some of them 
strongly influenced the development of the theory, especially in its early days. 
There is nothing strange in this: quantum theory aims to describe microscopic 
phenomena which are so far from our everyday experience that when analysing 
them, all Newtonian intuitions break down. Therefore, from the point of view 
of philosophy based on Newtonian mechanics, all the famous quantum post- 
ulates were incomprehensible and could only be justified at the stage of final 
results. As a matter of fact, the whole of the old quantum theory consisted of 
mysterious magic wands. For example the theoretical status of the Bohr- 
Sommerfeld quantum conditions was completely inconceivable. The same is 
true of the Planck-Einstein postulates about the quantum granular structure of 
electromagnetic radiation. Similarly, the material waves as postulated by de 
Broglie, were more a mathematical idea than a physical picture in a classical 

From the heuristic point of view two so-called 'principles' were of special
importance, namely the aforementioned correspondence principle and the
Heisenberg uncertainty principle. Some historical comments are necessary
here. The correspondence principle was formulated in the days of the old 
quantum theory. When formulating and developing the main ideas of quantum 
mechanics, Heisenberg, Born and Jordan referred explicitly to it (Heisenberg, 
1925; Born and Jordan, 1925; Born, Heisenberg and Jordan, 1926). On the
other hand, the uncertainty principle was derived by Heisenberg later in 1927,
two years after he had discovered matrix mechanics (Heisenberg, 1927). Hence 
it could be said that the uncertainty idea being a consequence of fundamental 
quantum rules, has not played any heuristic role in the development of 
quantum mechanics. However this is not the case. The uncertainty principle is 
only a mathematical/quantitative comment on the qualitative postulate which 
enabled Heisenberg to discover the quantum rules (Heisenberg, 1925). In the 
following, I shall call it the uncertainty postulate or 'postulate of the phase 

This postulate and the correspondence principle were just the magic wands 
which enabled Heisenberg to create the matrix mechanics— the first correct 
formulation of quantum theory. 

Now let us consider these postulates with special emphasis on their
qualitative content and philosophical assumptions.

(A) The Correspondence Principle 

According to this principle classical laws are asymptotic/approximate
expansions of the quantum ones. There are two meanings of the classical limit:
(a) the formal one based on the transition $\hbar \to 0$ and (b) the physical
asymptotics of large quantum numbers. Which of these two asymptotics is to be
used depends on the kind of problem. The oldest formulation of the
correspondence principle, which is due to Bohr, concerned only the theory of
atomic radiation. Bohr formulated some methodological guiding hints which
enabled him to get some qualitative and even quantitative results concerning the
intensities and polarizations of spectral lines. The selection rules, for example,
were formulated in such a way. The semi-classical theory of atomic spectra based
on the correspondence principle was a hybrid of the classical theory of
electromagnetic radiation and the Bohr-Sommerfeld quantum conditions imposed
upon a classical multiply-periodic system. Hence it joined in a mysterious,
magic way two contradictory pictures of radiation: the Bohr-Sommerfeld
quantum jumps and the classical continuous Hertz-Maxwell radiation.
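
The asymptotics of large quantum numbers can be illustrated with the textbook Bohr model of hydrogen, an example that is not worked out in the text. In units of the Rydberg frequency, the frequency emitted in the jump n → n − 1 is $1/(n-1)^2 - 1/n^2$, while the classical orbital frequency on the n-th orbit is $2/n^3$. The sketch below, with an illustrative function name, shows their ratio approaching one as n grows.

```python
def frequency_ratio(n):
    """Ratio of the n -> n-1 transition frequency to the classical orbital
    frequency of the n-th Bohr orbit in hydrogen (Rydberg factors cancel)."""
    transition = 1.0 / (n - 1) ** 2 - 1.0 / n ** 2  # in units of the Rydberg frequency
    orbital = 2.0 / n ** 3                          # same units
    return transition / orbital

for n in (2, 10, 100, 1000):
    print(n, frequency_ratio(n))
```

For small n the quantum and classical frequencies differ strongly, which is why the classical radiation picture could only serve as a guide in the limit of large quantum numbers.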

The very idea of the correspondence principle was based on the philosophi- 
cal belief in the existence of some rigorous quantum theory of atomic 
phenomena which had remained undiscovered up to 1925. This belief moti- 
vated all efforts at guessing this unknown theory by an appropriate reformulat- 
ing of its asymptotic form, i.e. classical mechanics and electrodynamics (Born, 

(B) The Uncertainty Postulate 

This postulate was the basic idea of the epoch-making paper of Heisenberg 
(1925). According to it, microscopic phenomena are essentially non-local in 
both configuration and phase space. Concepts such as a trajectory and a
hodograph of an electron become essentially inadequate within the framework
of atomic phenomena. They are incompatible with the Heisenberg matrix
representation of physical quantities. Therefore, the resulting quantum theory
breaks with classical ideas even on the elementary level of kinematics. Before
Heisenberg's discovery it was rather expected that the classical dynamics (the
equations of motion) was to be replaced by some quantum theory, but nobody
supposed the classical notions of state, position, etc., to be essentially
inadequate.

The uncertainty principle derived by Heisenberg in 1927 is a mathematical
comment on his uncertainty postulate. It describes in quantitative terms the
phase-space non-localness of quantum phenomena. The modern general
formulation of this principle predicts the relationship between the statistical
dispersions of arbitrary quantities when measured in the same quantum state.
When $\Delta A$, $\Delta B$ are the dispersions of the quantities described by the operators
A, B in the quantum state $\rho$, then

$$\Delta A\, \Delta B \ge \tfrac{1}{2}\, \bigl| \langle [A, B] \rangle_\rho \bigr|$$

In particular, for positions and momenta

$$\Delta q^i\, \Delta p_j \ge \frac{\hbar}{2}\, \delta^i_j$$

Therefore quantum phenomena are non-local in a classical phase space. The
critical phase volume characteristic for this non-localness is of the order $(\hbar/2)^n$,
where n is the number of degrees of freedom.
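
The general dispersion inequality can be checked numerically on the smallest non-trivial example. The sketch below uses a spin-1/2 system with the Pauli matrices as the operators A and B. This choice, the two test states and all function names are assumptions made for illustration, since the text itself only treats positions and momenta.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(psi, X):
    """<psi| X |psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * X[i][j] * psi[j]
               for i in range(n) for j in range(n))

def dispersion(psi, X):
    """Square root of <X^2> - <X>^2 for a Hermitian X."""
    mean = expectation(psi, X).real
    mean_sq = expectation(psi, matmul(X, X)).real
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))

def uncertainty_sides(psi, A, B):
    """Return (dA * dB, |<[A, B]>| / 2) in the pure state psi."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    comm = [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]
    lhs = dispersion(psi, A) * dispersion(psi, B)
    rhs = 0.5 * abs(expectation(psi, comm))
    return lhs, rhs

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
norm = math.sqrt(3.0)
states = {
    "spin up": [1 + 0j, 0j],
    "tilted": [1 / norm + 0j, (1 + 1j) / norm],
}
for name, psi in states.items():
    lhs, rhs = uncertainty_sides(psi, sigma_x, sigma_y)
    print(name, lhs, rhs)
```

In the first state the bound is saturated, in the second the product of dispersions is strictly larger than the bound.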

Within the finished modern framework of the quantum theory the uncer- 
tainty and the correspondence principles seem to be rather secondary results of 
the basic assumptions and automatically following from them in a purely logical 
way. In spite of such views we are going to show that there are still some 
doubtful questions and physical problems of interpretation connected with 
these principles. The correspondence of some classical and quantum concepts 
is a delicate matter. This concerns for example pure states. It appears that the
purely logical formal approach based on the $\hbar \to 0$ asymptotics is not sufficient.
Our analysis leads to interesting physical consequences concerning the
relationship between the concepts of information and symmetry on the classical
and the quantum level (Sławianowski, 1973). We start with the derivation of the
Weyl-Wigner-Moyal phase-space formulation of quantum mechanics directly
from the uncertainty postulate.


(if Heisenberg Had Started with Statistical Mechanics) 

The uncertainty principle does not preclude us from describing quantum 
phenomena in terms of classical phase space. It implies only that such a 
description shall be essentially non-local. 


The phase-space formulation of quantum mechanics is due to Weyl (1928, 
1931), Wigner (1932) and Moyal (1949). In particular, Wigner functions 
describing quantum statistical ensembles have long been used in physical 
chemistry. The phase-space methods as developed by Wigner seem to be a 
secondary accidental consequence of the usual Hilbert space approach. Weyl 
and Moyal developed more systematic formulations in which the classical 
phase space and its geometry played an essential role prior to the Hilbert space 
techniques. Nevertheless the basic ideas, motives and techniques of Moyal and 
those of Weyl were completely different. Moyal aimed at replacing the abstract
geometry of Hilbert spaces by more familiar statistical methods in a classical
phase space. On the other hand, the group-theoretical Weyl approach is based
completely on the mathematical a priori.

The formulation we present here joins and unifies the approaches of Weyl 
and Moyal. Starting with the uncertainty principle as a primary idea, we suggest
reformulating classical statistical mechanics so as to turn it into a non-local
theory. It appears that the Weyl-Wigner-Moyal formulation of quantum
mechanics is the most natural result. To get it, we will appeal to some
group-theoretical results of Bargmann (1954). Let us note that starting with the 
uncertainty relations as a primary basis for the 'deductive' construction of 
quantum mechanics is historically justified and free from tautology. In fact the 
famous derivation of this principle based on the idea of the Heisenberg 
microscope appealed explicitly to the semi-classical model of interaction 
between electromagnetic field and electrons. Hence, it was certainly possible to 
achieve this result before 1925, when the matrix mechanics was formulated. 

Unfortunately, the Weyl-Wigner-Moyal approach is applicable only to 
systems with affine symmetry of degrees of freedom: 

An affine phase space is a triplet $(P, \Pi, \Gamma)$ where:

(1). $(P, \Pi)$ is an affine space; P is its underlying set, i.e. a manifold of classical
states, and $\Pi$ is a linear space of translations (free vectors) on P.

(2). $\Gamma$ is a covariant skew-symmetric and non-degenerate tensor of the
second order on $\Pi$: $\Gamma \in \wedge^2 \Pi^*$.

The only translation carrying $a \in P$ over onto $b \in P$ will be denoted as $\overrightarrow{ab} \in \Pi$.
We will use only affine coordinates on P: when $(a; \ldots, e_i, \ldots)$ is an affine frame,
i.e. $a \in P$ and $\{e_i\}$ is a linear basis in $\Pi$, then the corresponding affine
coordinates $\xi^i(b)$ of $b \in P$ are the components of $\overrightarrow{ab}$ with respect to the basis $\{e_i\}$

$$\overrightarrow{ab} = \xi^i(b)\, e_i$$

The reciprocal contravariant tensor of $\Gamma$ will be denoted as $\tilde\Gamma$. Raising and
lowering of indices is to be understood in the sense of the skew-symmetric 'metric'
$\Gamma$. Instead of $\tilde\Gamma^{ab}$ we will write simply $\Gamma^{ab}$; this does not lead to
misunderstandings:

$$\Gamma^{ac}\, \Gamma_{cb} = \delta^a_b$$

$\Gamma$ gives rise to the skew-symmetric scalar product on $\Pi$ and to the Poisson
bracket operation.


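The Poisson bracket of smooth functions F, G on P is given by

$$\{F, G\} = \langle dF \otimes dG, \tilde\Gamma \rangle = \Gamma^{ab}\, \frac{\partial F}{\partial \xi^a}\, \frac{\partial G}{\partial \xi^b} \tag{3}$$

where $\xi^a$ are affine coordinates on P.

The algebraic non-singularity of $\Gamma$ implies that dim P = 2n, where n is a
natural number, the so-called number of degrees of freedom.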

One can choose the affine frame in P in such a way that




where I is the n × n identity matrix (Kronecker matrix), 0 is the matrix with all
elements vanishing, and $\Gamma_{ab}$ are the components of $\Gamma$ with respect to the chosen
frame. Denoting the corresponding affine coordinates as $(\ldots, q^i, \ldots, p_i, \ldots)$
we find

r „ „, dF dG dF dG 

{F,G} = — — t-— j — 

dpi dq dq dpi 


Such coordinates are called canonical ones. 
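
The coordinate expression of the Poisson bracket as a contraction of two gradients with a constant skew-symmetric matrix is easy to try out numerically. The sketch below assumes the sign convention in which $\{q^i, p_j\} = \delta^i_j$ (the opposite convention only flips the overall sign), orders the coordinates as $(q^1, \ldots, q^n, p_1, \ldots, p_n)$, and approximates the derivatives by central differences. All function names and the test functions are illustrative.

```python
def gradient(F, xi, h=1e-5):
    """Central-difference gradient of F at the phase-space point xi."""
    grad = []
    for a in range(len(xi)):
        up, down = list(xi), list(xi)
        up[a] += h
        down[a] -= h
        grad.append((F(up) - F(down)) / (2.0 * h))
    return grad

def canonical_gamma(n):
    """Skew-symmetric 2n x 2n matrix with blocks [[0, I], [-I, 0]]
    (assumed sign convention, coordinates ordered as q^1..q^n, p_1..p_n)."""
    size = 2 * n
    gam = [[0.0] * size for _ in range(size)]
    for i in range(n):
        gam[i][n + i] = 1.0
        gam[n + i][i] = -1.0
    return gam

def poisson_bracket(F, G, xi):
    """{F, G} = Gamma^{ab} dF/dxi^a dG/dxi^b at the point xi."""
    size = len(xi)
    gam = canonical_gamma(size // 2)
    dF, dG = gradient(F, xi), gradient(G, xi)
    return sum(gam[a][b] * dF[a] * dG[b]
               for a in range(size) for b in range(size))

# One degree of freedom: coordinate q, momentum p, oscillator energy H.
q = lambda xi: xi[0]
p = lambda xi: xi[1]
H = lambda xi: 0.5 * (xi[0] ** 2 + xi[1] ** 2)
point = [0.3, -1.2]
print(poisson_bracket(q, p, point), poisson_bracket(q, H, point))
```

Because the matrix is skew-symmetric, exchanging F and G changes the sign of the result, which the same routine can be used to confirm.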

Affine structure gives rise to the translationally-invariant Lebesgue measure
on P. It is unique up to normalization. One can normalize it in such a way that
the integration consists in

/-*J/dn= J/d? 1 . . . dq" dp, . . . dp n 


for any system of canonical coordinates. The above definition is correct 
because it does not depend on the particular choice of canonical coordinates. 

Such a normalization is inconvenient from the point of view of statistical 
mechanics, because the measure has a physical dimension of the nth power of 
action. So there is no unique definition of the entropy of statistical ensembles 
until some unit of action A is chosen. Such a choice enables us to get rid of the 
physical dimension in the volume measure on F. Namely we put this measure 
fix as follows 

\f(p) dMA (p) = A "J/ dq 1 . . . dq n d Pl . . . dp n 


where (q . . . p„) are arbitrary canonical coordinates on F. 

The entropy S k (p) of the probabilistic measure on F, being absolutely 
continuous with respect to p. A and having density p, is given as 

S\(p)= IplnpdAix 


Roughly speaking fi^(U) is the number of A" cells contained in the domain 
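
To make the cell-counting reading of $\mu_\Delta(U)$ concrete, here is a minimal
Python sketch. It is an illustration only and not part of the original argument:
it assumes a one-dimensional harmonic oscillator, for which the region
$H \le E$ is an ellipse of area $2\pi E/\omega$; the function name `cell_count`
and all numerical values are hypothetical.

```python
import math

def cell_count(volume, delta, n):
    """Return mu_Delta(U) = volume / delta**n, the number of Delta^n cells in U.

    volume : phase-space volume of U, in units of action**n
    delta  : chosen unit of action (Delta in the text)
    n      : number of degrees of freedom
    """
    return volume / delta ** n

# Hypothetical illustration values (CGS units), not taken from the text.
hbar = 1.0546e-27            # erg sec
h = 2.0 * math.pi * hbar     # Planck constant
omega = 1.0e15               # oscillator angular frequency, 1/sec
E = 10.0 * hbar * omega      # energy bounding the region H <= E

# Area of the ellipse H <= E for one degree of freedom.
area = 2.0 * math.pi * E / omega

cells_h = cell_count(area, h, 1)            # Delta = h
cells_half_h = cell_count(area, h / 2, 1)   # Delta = h/2

print(cells_h, cells_half_h)
```

With $\Delta = h$ the count equals $E/\hbar\omega$, the familiar estimate of the
number of oscillator levels below $E$; halving the unit of action doubles the
count without changing anything essential.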


According to the Planck theory of black-body radiation we have at our
disposal the quantum of action, which is at the same time a universal constant
of nature:

\[ h = 6.54 \times 10^{-27}\ \text{erg sec} = 2\pi\hbar \]

Hence, we could put $\Delta = h$. This is the first place where physical quantum
notions taken from experimental analysis are introduced into the geometrical
framework of the classical phase space. To retain the correspondence with some
commonly used formulas we will rather put $\Delta = h/2$. Obviously, this does not
alter any essential matter; such a choice only simplifies our notation. In the
following we will often write simply $\mu$, $S$ instead of $\mu_{h/2}$, $S_{h/2}$.
Hence our measure is given as follows:

\[ \int f \, d\mu = (2/h)^n \int f \, dq^1 \ldots dq^n \, dp_1 \ldots dp_n \]

Besides the above entropy argument there is one more reason justifying (7)
with $\Delta = h/2$. In fact, according to the uncertainty relations any essentially
$2n$-dimensional physical situation in $P$ must be smeared out onto a region in $P$
whose volume exceeds $(h/2)^n$. States of quantum systems are related to
phase-space regions whose $\mu_{h/2}$-volumes are of order one at least.
Therefore, roughly speaking, $\mu_{h/2}(U)$ gives an account of the number of
quantum states within the phase-space region $U \subset P$. Similarly, one can
reasonably expect that quantum statistics attaches well-defined probabilities
only to those subsets which are large when compared with critical $h^n$ cells
(which 'contain' many quantum states). The uncertainty principle suggests that
we reformulate classical statistical mechanics so as to turn it into a non-local
theory.

Let us start with the basic notions of classical statistics in $P$. Physical
quantities, i.e. random variables, are described by real analytic functions on $P$.

Statistical ensembles are described by probabilistic measures on $P$. We are
especially interested in measures absolutely continuous with respect to $\mu_{h/2}$;
they are described by positive statistical densities normalized in such a way that

\[ \int \rho \, d\mu_{h/2} = 1 \]

\[ \rho^* = \rho, \qquad \int \rho \, A^* A \, d\mu \ge 0 \quad \text{for arbitrary } A \tag{10} \]

In the following, even when talking about measures concentrated on subsets of
$\mu$-measure zero, we will describe them briefly by 'densities' $\rho$, keeping
in mind that they are then distributions rather than usual functions.


The operational statistical interpretation of the above notions is based on the
following concepts:

(1). Expectation value of a physical quantity $A$ in the ensemble $\rho$:

\[ \langle A \rangle_\rho = \int A(p)\, \rho(p) \, d\mu_{h/2}(p) \tag{11} \]

(2). The non-normalized probability of detecting a system in a statistical state
$\rho_2$ when it is known to be in a state $\rho_1$:

\[ P(\rho_1, \rho_2) = \int \rho_1(p)\, \rho_2(p) \, d\mu_{h/2}(p) \tag{12} \]

(3). 'Proper ensemble' of a physical quantity $A$ with an eigenvalue $a \in A(P)$.
Measurements of $A$ on this ensemble give the result $a$ with certainty;
there is no statistical spread. Therefore, $A$ becomes constant and equal
to $a$ when restricted to the support of $\rho$:

\[ A|_{\mathrm{Supp}\,\rho} = a \tag{13} \]

This is equivalent to the eigenequation:

\[ A\rho = a\rho \tag{14} \]

When $A$ is a non-constant analytic function, then:

\[ \rho = F\,\delta(A - a) \tag{15} \]

where $F$ is non-negative on $\mathrm{Supp}\,\rho$. Hence, 'proper ensembles' of $A$
with an eigenvalue $a$ are measures concentrated on the $a$-value surface
of $A$:

\[ M_{(A,a)} = \{p \in P : A(p) = a\} \tag{16} \]

When $F = 1$, (15) becomes an $(A, a)$-microcanonical ensemble, i.e. the
statistical distribution $\delta(A - a)$.

The only essentially local notion used in (1), (2) and (3) is the usual, pointwise
product of functions (distributions). This is especially apparent in (3), i.e. in the
classical spectral analysis of statistical ensembles and physical quantities. The
lack of statistical dispersion of $A$-measurements performed on the ensemble $\rho$
is evidently a physical notion, prior to any particular mathematical model. It is
only the structure of the associative function-algebra over $P$ under the pointwise
product that is responsible for the essentially local relationship between
proper ensembles of $A$ and value surfaces of $A$. Therefore, the most natural
way to achieve a non-local description of measurements in $P$ compatible
with the uncertainty principle is to replace the afore-mentioned associative
algebra by some other function-algebra based on a non-localized product

\[ (A \perp B)(p) = \int \mathcal{K}(p; p_1, p_2)\, A(p_1)\, B(p_2)\, d\mu(p_1)\, d\mu(p_2) \tag{17} \]


Such a product gives rise to a non-local statistical theory in $P$: one should
only replace the pointwise product in (10), (11) and (12), and especially in (14), by
the above product (17). Now we will find the appropriate form of $\mathcal{K}$,
starting with some natural postulates.

We assume $\perp$ to be translationally invariant:

\[ (A \perp B)_\pi = A_\pi \perp B_\pi \tag{18} \]

for arbitrary functions $A$, $B$ and $\pi \in \Pi$; $C_\pi$ denotes the function
obtained by a $\pi$-translation of $C$:

\[ C_\pi(q) = C(p), \qquad \overrightarrow{pq} = \pi \tag{19} \]

Translational invariance implies

\[ \mathcal{K}(p; p_1, p_2) = K(\overrightarrow{pp_1}, \overrightarrow{pp_2}) \tag{20} \]

for some function $K$.

We assume the non-local product (17) to be associative:

\[ (A \perp B) \perp C = A \perp (B \perp C) \tag{21} \]

This results in a functional equation for $K$:

\[ \int K(x_1, x)\, K(x_2 - x, x_3 - x)\, dx = \int K(x_1 - x, x_2 - x)\, K(x, x_3)\, dx \tag{22} \]

(Sławianowski, 1974), where $dx$ is a translationally invariant Lebesgue measure
on $\Pi$. The convolution-like structure of (22) suggests that we search for its
solution in the Fourier representation. Our functional equation then becomes
purely algebraic:

\[ \hat K(\xi_1, \xi_2 + \xi_3)\, \hat K(\xi_2, \xi_3) = \hat K(\xi_1, \xi_2)\, \hat K(\xi_1 + \xi_2, \xi_3) \tag{23} \]

(Sławianowski, 1974), where $\hat K: \Pi^* \times \Pi^* \to \mathbb{C}^1$ is the
Fourier transform of $K$:

\[ \hat K(\xi, \eta) = \int K(x, y) \exp[-i(\langle \xi, x \rangle + \langle \eta, y \rangle)]\, dy\, dx \tag{24} \]

Equation (23) is easily recognized to coincide with the functional equation for
factors of projective representations of the abelian additive group $\Pi^*$
(Bargmann, 1954). Even without appealing to the theory of projective
representations, it is easy to show (in elementary terms) that the only smooth and
bounded solutions of (23) are given by:

\[ \hat K(\xi, \eta) = \exp[i \langle \xi \otimes \eta, B \rangle] = \exp[i B^{ab} \xi_a \eta_b] \tag{25} \]

where $B \in \Pi \otimes \Pi$ is an arbitrary real, contravariant tensor of the
second order on $\Pi$ (Bargmann, 1954).
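
That an exponential of a bilinear form solves (23) can be checked numerically.
The following Python sketch is only an illustration under stated assumptions:
covectors are represented as plain lists of components, $B$ is a randomly chosen
real matrix, and the helper names (`k_hat`, `cocycle_defect`) are hypothetical.
It verifies the sufficiency direction only, not the uniqueness claim.

```python
import cmath
import random

def k_hat(xi, eta, B):
    """K^(xi, eta) = exp(i * B^{ab} xi_a eta_b), cf. equation (25)."""
    s = sum(B[a][b] * xi[a] * eta[b]
            for a in range(len(xi)) for b in range(len(eta)))
    return cmath.exp(1j * s)

def vec_add(u, v):
    """Componentwise sum of two covectors."""
    return [x + y for x, y in zip(u, v)]

def cocycle_defect(xi1, xi2, xi3, B):
    """|LHS - RHS| of the functional equation (23)."""
    lhs = k_hat(xi1, vec_add(xi2, xi3), B) * k_hat(xi2, xi3, B)
    rhs = k_hat(xi1, xi2, B) * k_hat(vec_add(xi1, xi2), xi3, B)
    return abs(lhs - rhs)

random.seed(0)
dim = 4  # dimension of Pi* for n = 2 degrees of freedom (illustrative choice)
B = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]

max_defect = 0.0
for trial in range(100):
    xi1 = [random.uniform(-2.0, 2.0) for _ in range(dim)]
    xi2 = [random.uniform(-2.0, 2.0) for _ in range(dim)]
    xi3 = [random.uniform(-2.0, 2.0) for _ in range(dim)]
    max_defect = max(max_defect, cocycle_defect(xi1, xi2, xi3, B))

print(max_defect)  # of the order of floating-point rounding error
```

The check works for any real $B$, symmetric or not, because both sides of (23)
reduce to $\exp\{i[B(\xi_1, \xi_2) + B(\xi_1, \xi_3) + B(\xi_2, \xi_3)]\}$ by
bilinearity.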

The correspondence principle suggests that we put $B$ proportional to
$\tilde\Gamma$: $B = b\tilde\Gamma$, because of the fundamental role of $\Gamma$ in
the geometry of $P$. If we had chosen any other form of $B$, we would have broken
the symplectic symmetry of the problem; there is no sufficient, non-arbitrary
reason for any other choice of $B$. One can easily show that when
$B = b\tilde\Gamma$, then:

\[ K(x, y) = \exp\left[\frac{i\sigma}{\hbar} \langle \Gamma, x \otimes y \rangle\right] = \exp\left[\frac{i\sigma}{\hbar}\, \Gamma_{ab}\, x^a y^b\right] \tag{26} \]

where $\sigma$ is some real number depending on $b$; the use of $\hbar$ enables us
to get rid of the physical dimension in the exponent, which is necessary if (26)
is to be well-defined. It appears that the particular choice of $\sigma$ does not
matter: associative algebras corresponding to various values of $\sigma$ are
isomorphic with each other. To attain the correspondence with the currently used
notation, we put $\sigma = 2$. (Obviously, there is neither physics nor
mathematics in any such choice.)

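The non-local Weyl-Moyal product is defined as:

\[ (A \perp B)(p) = \int \exp\left[\frac{2i}{\hbar} \langle \Gamma, \overrightarrow{pp_1} \otimes \overrightarrow{pp_2} \rangle\right] A(p_1)\, B(p_2)\, d\mu_{h/2}(p_1)\, d\mu_{h/2}(p_2) \tag{27} \]

Let us quote the following properties of $\perp$:

\[ \int A \perp B \, d\mu = \int A B \, d\mu = \langle A^* | B \rangle \tag{28} \]

\[ (A \perp B)^* = B^* \perp A^* \tag{29} \]

\[ 1 \perp A = A \perp 1 = A \tag{30} \]

moreover, the constant function 1 is the only function satisfying (30) for all $A$.

\[ A^* \perp A \ne 0 \tag{31} \]

unless $A$ is a function vanishing almost everywhere.

\[ \langle C \perp A | B \rangle = \langle A | C^* \perp B \rangle = \langle C | B \perp A^* \rangle \tag{32} \]

\[ \text{If } C \perp A = 0 = A \perp D \text{ for all } A, \text{ then } C = D = 0 \tag{33} \]

Contrary to the pointwise product, the Weyl multiplication is non-commutative;
its centre consists of constant functions only.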

Besides, let us notice the following asymptotic formulas, which give an account
of the correspondence principle:

\[ \lim_{\hbar \to 0} A \perp B = AB \tag{34} \]

\[ \lim_{\hbar \to 0} \frac{1}{i\hbar}(A \perp B - B \perp A) = \{A, B\} \tag{35} \]

provided $A$, $B$ are smooth and $A \perp B$ is well-defined. Replacing in (10),
(11), (12) and (14) the pointwise product $AB$ by the Weyl product $A \perp B$ we
get the non-local statistical theory in $P$. The property (28) implies that (11)
and (12) do not change at all. On the contrary, (14) is replaced in a non-trivial
way by the following eigenequation:

\[ A \perp \rho = a\rho \tag{36} \]

where both $a$ and $\rho$ are unknown. We are looking only for probabilistic
solutions of (36), i.e. such functions (distributions) $\rho$ which satisfy the
normalization condition and are positive semi-definite in the non-local sense:

\[ \int \rho \, d\mu = 1 \tag{37} \]

\[ \langle \rho | A^* \perp A \rangle = \int \rho\,(A^* \perp A)\, d\mu = \int \rho \perp A^* \perp A \, d\mu \ge 0 \tag{38} \]

for all functions $A$. The subset of $\mathbb{R}$ composed of those values of $a$
for which probabilistic solutions of (36) exist is in no direct way related to
$A(P)$. The relationship between solutions of (36) and value surfaces of $A$ is
essentially non-local. Let us notice that, contrary to the classical
semi-definiteness condition $\langle \rho | A^* A \rangle \ge 0$, (38) does not
imply that $\rho \ge 0$. Quantum density functions are allowed to take negative
values. They become positive in the usual, pointwise and local sense after
coarse-graining over subsets of $P$ which are large when compared with critical
Heisenberg $h^n$ cells. The quantity

\[ p_U = \int_U \rho \, d\mu \]

can be approximately interpreted as a non-negative probability of localization
in $U \subset P$ only when $\mu(U) \gg 1$.

Quantum spectral analysis and quantum measurements, being based on (36),
are essentially non-local in $P$. The very structure of (27) implies that this
non-localness is of the order $h^n$, which agrees with the uncertainty principle.

The non-local Weyl-Moyal statistical mechanics is obviously isomorphic
with the usual Hilbert space formulation of quantum mechanics. The
corresponding isomorphism, the so-called Weyl prescription, attaches operators in
an appropriate Hilbert space to functions on $P$. Denoting the operator
corresponding to $A$ by $\hat A$, we have:

\[ \int A \, d\mu = \mathrm{Tr}\, \hat A, \qquad \widehat{aA + bB} = a\hat A + b\hat B, \qquad \widehat{A \perp B} = \hat A \hat B, \]
\[ \widehat{A^*} = \hat A^+, \qquad \langle A | B \rangle = \int A^* \perp B \, d\mu = \int A^* B \, d\mu = \mathrm{Tr}(\hat A^+ \hat B) \]

The space $L^2(\mathbb{R}^n)$ can be used as a Hilbert space of wave functions. In
fact, let $(q^i, p_i)$ be canonical affine coordinates on $P$ and $A(q^i, p_i)$
some function on $P$. (Obviously $A$ itself is a function on $\mathbb{R}^{2n}$;
hence $A(q^i, p_i)$ is to be understood as a function on $P$ resulting from
superposing two mappings: $(q^i, p_i): P \to \mathbb{R}^{2n}$ and
$A: \mathbb{R}^{2n} \to \mathbb{C}^1$. Do not take $A(q^i, p_i)$ to be a value of
$A$ at a point of $\mathbb{R}^{2n}$!) Then, denoting the corresponding operator by
$\hat A$ and the kernel of its integral representation by
$\langle x^i | \hat A | y^i \rangle$:

\[ (\hat A \Psi)(x^i) = \int \langle x^i | \hat A | y^i \rangle\, \Psi(y^i)\, d_n y \]



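we find:

\[ \langle x^i | \hat A | y^i \rangle = \left(\frac{2}{h}\right)^n \int \exp\left[\frac{i}{\hbar}\, p_i (x^i - y^i)\right] A\!\left(\frac{x^i + y^i}{2},\, p_i\right) d_n p \tag{41} \]

(41) is just the famous Weyl prescription.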

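A pure state corresponding to the wave function $\Psi(x^i)$ is described in the
Weyl-Moyal language by the following Wigner function (Moyal, 1949;
Wigner, 1932):

\[ \rho = \left(\frac{1}{2}\right)^n \int \Psi^*\!\left(q^i - \frac{r^i}{2}\right) \exp\left(-\frac{i}{\hbar}\, r^i p_i\right) \Psi\!\left(q^i + \frac{r^i}{2}\right) d_n r \tag{42} \]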

Obviously (42) implies then:

\[ \rho \perp \rho = \rho \tag{43} \]

In contrast to the system of classical 'proper equations':

\[ A_i \rho = a_i \rho, \qquad i = 1 \ldots m \tag{44} \]

the corresponding quantum system of eigenequations:

\[ A_i \perp \rho = a_i \rho, \qquad i = 1 \ldots m \tag{45} \]

need not be compatible. The compatibility condition has the form:

\[ [A_i, A_j] = \frac{1}{\hbar i}(A_i \perp A_j - A_j \perp A_i) = C^k_{ij} \perp (A_k - a_k) \tag{46} \]

where $C^k_{ij}$ are some functions (Dirac, 1964).

One can show that $\rho$ describes a pure state, i.e. (42) or equivalently (43) is
satisfied, if and only if it satisfies a maximal compatible system (45), i.e. one
which admits no non-trivial extension.

In more rigorous terms: let $B$ denote the space of those functions $A$ on $P$ for
which the corresponding operators $\hat A$ (via the Weyl prescription) are bounded.
Let $\rho \in B$ describe a quantum statistical ensemble:

\[ \int \rho \, d\mu = \mathrm{Tr}\, \hat\rho = 1, \qquad \langle \rho | A^* \perp A \rangle \ge 0 \]

Now we introduce the following space of functions:

\[ E_\rho = \{A \in B : A \perp \rho = 0\} \tag{47} \]

Then $\rho$ is a pure state if and only if $E_\rho$ is a maximal left ideal in the
associative (although non-commutative) algebra $(B, \perp)$. When the functions
$(A_i - a_i)$ occurring in (45) generate such an ideal, then any consistent system
of eigenequations:

\[ A_i \perp \chi = a_i \chi, \qquad F \perp \chi = 0 \]

is equivalent to (45); hence, there exist functions $F_j$ such that:

\[ F = \sum_j F_j \perp (A_j - a_j) \]

For an arbitrary, not necessarily pure, statistical ensemble $\rho$, the real
subspace of $E_\rho$ composed of real-valued functions is a Lie algebra under the
quantum Poisson bracket (i.e. it is closed under this operation).


(Information and Symmetry of Statistical Ensembles) 

There are two kinds of asymptotics describing the correspondence principle:
'large quantum numbers' and 'small values of the Planck constant'. They are
supposed to be essentially equivalent; however, up to now there is no rigorous
and general proof of this equivalence. Besides, in spite of some current views,
neither of these methods leads automatically to classical laws when starting
with quantum theory. Some kind of physical intuition and 'feeling' is necessary
to avoid mistakes; there are some dangers and traps typical of either of these
methods.

Roughly speaking, the asymptotics of large quantum numbers consists of the
limit transition:

\[ \Delta n \to \infty, \qquad n / \Delta n \to \infty \tag{50} \]

where $n$ is the mean quantum number of a physical situation and $\Delta n$ is the
spread of quantum numbers (the width of a quantum state in the $n$-representation).
In technical terms:

\[ n \gg \Delta n \gg 1 \tag{50a} \]

Quantum formulae should then approach those derived from the classical laws.
The typical danger of such a method is the neglect of the condition
$\Delta n \gg 1$. In particular, let $\{\Psi_n\}$ be the wave functions of
stationary states of a bound system (e.g. an atom). In general, it is not true for
large values of $n$ that the $\Psi_n$ become quasiclassical throughout the whole
configuration space. (By quasiclassical we mean here: interpretable in terms of
geometric objects of the classical Hamilton-Jacobi theory.) However, when
superposing such quantum states with coefficients slowly varying inside the
interval $(n - \Delta n/2,\ n + \Delta n/2)$ and vanishing outside of it, the
non-classical terms of the various $\Psi_n$ approximately cancel each other,
provided (50a) is satisfied. In such expressions the functions $\Psi_n$ can be
replaced by their classical counterparts $\Psi_{\mathrm{cl},k}$, built from the
Hamilton-Jacobi objects and described by continuous 'quantum numbers' $k$.
Obviously, summation over $n$ is then to be replaced by integration over $k$.

Basing the asymptotics on large quantum numbers is a convincing physical
procedure. Its main idea is the cancelling of interference in situations described
by rapidly oscillating wave functions (just those with large quantum numbers or
short de Broglie waves).

The asymptotics of 'small values' of the Planck constant is of a rather formal
nature. The Planck constant $\hbar$ is then treated as a free parameter of the
theory. All fundamental mathematical expressions of quantum mechanics are analytic
functions of $\hbar$ on the positive real semi-axis $0 < \hbar < \infty$.
Quasiclassical analysis is based on the expansion of these expressions in
asymptotic series about the dangerous point $\hbar = 0$. Passing over from finite
to vanishing $\hbar$ is connected with some qualitative discontinuities. As a rule,
the afore-mentioned asymptotic series are divergent. According to the
correspondence principle, their lowest order terms are expected to coincide with
the appropriate classical expressions. The structure of the asymptotics is
essentially the same as that in the 'large quantum numbers' approach: when
$\hbar \to 0$, quantum interference phenomena break down because we are then
dealing with rapidly oscillating functions. In fact the basic quantum formulas
involve expressions such as $\exp(iW/\hbar)$, where $W$ does not depend on $\hbar$.
Obviously they become rapidly oscillating when $\hbar \to 0$. This is just the idea
of the method of stationary phase (Born and Wolf, 1964; Erdelyi, 1956).
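
The mechanism of the method of stationary phase can be seen in a one-dimensional
toy integral. The Python sketch below is an illustration only: the phase
$W(x) = x^2/2$ and the amplitude $g(x) = e^{-x^2}$ are hypothetical choices made
because the integral $\int g(x) \exp(iW(x)/\hbar)\, dx$ is then a Gaussian with a
known closed form, which is compared with the leading stationary-phase term
$g(x_0)\sqrt{2\pi\hbar/|W''(x_0)|}\, \exp(iW(x_0)/\hbar + i\pi/4)$ at the single
stationary point $x_0 = 0$.

```python
import cmath
import math

def exact_integral(hbar):
    """Closed form of the integral of exp(-x**2) * exp(i*x**2/(2*hbar)) over the real line.

    It is a Gaussian integral sqrt(pi/a) with a = 1 - i/(2*hbar), Re a > 0,
    so the principal branch of the square root is the correct one.
    """
    a = 1.0 - 1j / (2.0 * hbar)
    return cmath.sqrt(math.pi / a)

def stationary_phase(hbar):
    """Leading stationary-phase term for W(x) = x**2/2 and g(x) = exp(-x**2).

    The only stationary point is x0 = 0, where W(x0) = 0, W''(x0) = 1 and g(x0) = 1.
    """
    return math.sqrt(2.0 * math.pi * hbar) * cmath.exp(1j * math.pi / 4.0)

errors = {}
for hbar in (1e-1, 1e-2, 1e-3):
    ratio = exact_integral(hbar) / stationary_phase(hbar)
    errors[hbar] = abs(ratio - 1.0)   # relative error of the leading term
    print(hbar, errors[hbar])
```

In this example the relative error decreases roughly in proportion to $\hbar$,
which is the behaviour expected of the first term of an asymptotic series.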

Technically, the $\hbar \to 0$ approach is much easier and more 'automatic' than
that based on $n \to \infty$. However, it is rather formal and one must not forget
that the asymptotics of small $\hbar$ is only a convenient conceptual shorthand for
the physical asymptotics based on large quantum numbers. In fact, the transition
$\hbar \to 0$ formally transforms the whole conceptual structure of the quantum
theory into a classical one, but it has nothing to do with the real laboratory
conditions of 'quasiclassicality'. Physics is interested rather in answering the
question 'what are quasiclassical situations in the real world, where the Planck
constant has a fixed value?' Of course, it is only the $n \to \infty$ asymptotics
that is able to answer this question. It shows that quantum laws, both kinematical
and dynamical, when applied to states compatible with (50), asymptotically
degenerate to classical laws. Conditions (50) justify even the possibility of an
approximate description of a quantum system in terms of the classical notion of

Let us notice that in situations described by (50), $\hbar$ is small when compared
with typical values of physical quantities, which are then of the order $\hbar n$,
$\hbar \Delta n$. This is why the asymptotics $\hbar \to 0$ is justified, but only
as a shorthand for 'large quantum numbers'.

Typical dangers of the $\hbar \to 0$ asymptotics are as follows. Let $\{\Psi_n\}$
again be the stationary states of a bound system. As calculated by means of the
$\hbar$-dependent Schrödinger equation, the $\Psi_n$ depend in addition on $\hbar$.
Let us indicate this explicitly by writing $\Psi_{(n,\hbar)}$. It would be
meaningless to calculate $\lim_{\hbar \to 0} \Psi_{(n,\hbar)}$ and expect any
relationship with classical expressions; moreover, as a rule, such a limit does
not exist at all. The reason is that it is only the fixed quantum number $n$, or
equivalently the number of nodes of $\Psi_n$, that describes the quantum structure
of a state and decides to what extent $\Psi_{(n,\hbar)}$ is essentially quantum or
quasiclassical. Varying $\hbar$ in $\Psi_{(n,\hbar)}$ when $n$ is fixed would be
non-physical, because the scale, or order, of 'classicality' of $\Psi_n$ depends
only on $n$. To take an extreme example: we could consider a ground state,
corresponding to the smallest possible value of $n$. It is obvious that there is
neither a classical limit nor anything reasonable in putting $\hbar \to 0$ in the
wave function of a ground state, because $n$ is then fixed and small, and
$\hbar \to 0$ is only the theoretical shorthand for $n \to \infty$. Similarly, it is
meaningless to apply the $\hbar \to 0$ asymptotics when studying spinning
particles, because then some quantum numbers (describing internal angular
momentum) are small.

In the following, we use the methods of the $\hbar \to 0$ asymptotics, carefully
comparing some results with the analysis of the order of quantum numbers. Of
course, we remain within the Weyl-Wigner-Moyal framework. This formulation of
quantum mechanics, being based on the geometry of classical phase spaces, is
especially convenient when studying quasiclassical phenomena and the
correspondence principle.

As we have mentioned in the previous section, the non-local Weyl-Moyal
operations reduce in the limit $\hbar \to 0$ to the local, pointwise operations.
More precisely:

Let $A$, $B$ be smooth functions over $P$ for which the Weyl product $A \perp B$ is
well-defined (the integral (27) converges). Expanding (27) into an asymptotic
series about $\hbar = 0$ by means of the method of stationary phase (Born and Wolf,
1964; Erdelyi, 1956), we get:

\[ A \perp B = AB + \frac{i\hbar}{2} \{A, B\} + \ldots \]

so that

\[ \lim_{\hbar \to 0} A \perp B = AB \]

\[ \lim_{\hbar \to 0} \frac{1}{\hbar i}(A \perp B - B \perp A) = \{A, B\} \]

Hence the pointwise product and the classical Poisson bracket are classical
limits of the Weyl product and the quantum Poisson bracket (Moyal bracket),
respectively.
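
For polynomial functions the asymptotic series terminates, so the statements
above can be checked exactly. The following Python sketch is an illustration
under explicit assumptions, not the construction used in the text: it takes one
degree of freedom, represents polynomials in $(q, p)$ as dictionaries of
coefficients, uses the sign convention $\{q, p\} = 1$ for the bracket, and
assumes the standard series form of the product, $\sum_k (i\hbar/2)^k P^k / k!$,
where $P$ is the bidifferential operator of the Poisson bracket, in place of the
integral (27). All function names are hypothetical.

```python
from math import comb, factorial

Q, P = 0, 1  # positions of the q- and p-exponent in a monomial key (i, j)

def diff(poly, var, times=1):
    """Differentiate {(i, j): c}, meaning sum of c * q**i * p**j, with respect to q or p."""
    for _ in range(times):
        out = {}
        for exps, c in poly.items():
            e = exps[var]
            if e > 0:
                new = list(exps)
                new[var] -= 1
                out[tuple(new)] = out.get(tuple(new), 0) + c * e
        poly = out
    return poly

def mul(a, b):
    """Pointwise product of two polynomials."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

def add(a, b, scale=1):
    """Return a + scale * b."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + scale * c
    return out

def degree(a):
    return max((i + j for (i, j) in a), default=0)

def bracket(a, b):
    """Poisson bracket dA/dq dB/dp - dA/dp dB/dq (assumed sign convention)."""
    return add(mul(diff(a, Q), diff(b, P)), mul(diff(a, P), diff(b, Q)), -1)

def star(a, b, hbar):
    """Series form of the non-local product; the sum is finite for polynomials."""
    out = {}
    for k in range(min(degree(a), degree(b)) + 1):
        for j in range(k + 1):
            left = diff(diff(a, Q, k - j), P, j)    # d_q^(k-j) d_p^j A
            right = diff(diff(b, P, k - j), Q, j)   # d_p^(k-j) d_q^j B
            coeff = (1j * hbar / 2) ** k / factorial(k) * comb(k, j) * (-1) ** j
            out = add(out, mul(left, right), coeff)
    return out

def distance(a, b):
    """Largest absolute coefficient of a - b."""
    return max((abs(c) for c in add(a, b, -1).values()), default=0.0)

hbar = 0.5
q, p, one = {(1, 0): 1}, {(0, 1): 1}, {(0, 0): 1}
A = {(2, 1): 1.0, (0, 1): 2.0}    # q**2 p + 2 p
B = {(1, 2): 1.0, (1, 0): -1.0}   # q p**2 - q
C = {(1, 1): 1.0, (0, 3): 0.5}    # q p + p**3 / 2

commutator = add(star(q, p, hbar), star(p, q, hbar), -1)
assoc_defect = distance(star(star(A, B, hbar), C, hbar),
                        star(A, star(B, C, hbar), hbar))
unit_defect = distance(star(one, A, hbar), A)

small = 1e-3  # a small value of hbar for the classical-limit check
moyal = add({}, add(star(A, B, small), star(B, A, small), -1), 1 / (1j * small))
limit_defect = distance(moyal, bracket(A, B))
```

Under these assumptions `commutator` is the constant $i\hbar$, the associativity
and unit defects vanish up to rounding, and `limit_defect` is of order
$\hbar^2$, in line with the limits quoted above.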

On the quantum level, the structures of Lie algebra and associative algebra
are directly, algebraically related to each other. In fact the quantum Poisson
bracket $[A, B] = \frac{1}{\hbar i}(A \perp B - B \perp A)$ is algebraically built
from the associative product (via a commutator). This is not the case on the
classical level: $\{A, B\}$ involves differentials of $A$, $B$ and fails to be an
algebraic function of $A$, $B$. This is the main qualitative discontinuity of the
classical limit $\hbar \to 0$. There are serious physical consequences of this
fact. Namely, let us consider a left eigenequation:

\[ A \perp \rho = a\rho \tag{53} \]

where $A$ is an arbitrary physical quantity ($A^* = A$, $\hat A$ is Hermitian) and
$\rho$ is a statistical ensemble: $\int \rho \, d\mu = 1$,
$\int \rho\,(B^* \perp B)\, d\mu \ge 0$ for arbitrary $B$, $\rho^* = \rho$.

Obviously (53) is the Weyl-Moyal counterpart of the left operator eigenequation
for density operators:

\[ \hat A \hat\rho = a \hat\rho \tag{53a} \]

Taking the complex conjugate of (53) and subtracting it from (53), we get via (29):

\[ [A, \rho] = \frac{1}{\hbar i}(A \perp \rho - \rho \perp A) = 0 \tag{54} \]

or, in operator terms:

\[ [\hat A, \hat\rho] = 0 \]

But (54) means that the statistical ensemble $\rho$ is invariant under the
one-parameter unitary group generated infinitesimally by $A$.
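
The step from (53) to (54) can be spelled out as follows. This is a sketch which
uses only the conjugation rule (29), $A^* = A$ and $\rho^* = \rho$, and which
assumes in addition that the eigenvalue $a$ is real, as it must be for a
Hermitian $\hat A$:

```latex
\begin{align*}
A \perp \rho &= a\rho
  && \text{(53)} \\
(A \perp \rho)^* = \rho^* \perp A^* = \rho \perp A &= a^* \rho^* = a\rho
  && \text{(complex conjugate of (53), using (29))} \\
A \perp \rho - \rho \perp A &= 0
  && \text{(difference of the two lines)}
\end{align*}
```

No analogous step exists for the pointwise product, which is commutative and
therefore yields only the trivial identity $A\rho - \rho A = 0$.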

Example: let $A = L_i n^i$ be the component of the angular momentum along the
axis given by the unit vector $n$. Now, let $\rho$ be a statistical ensemble with
the sharply defined value $m$ of $L_i n^i$:

\[ L_i n^i \perp \rho = m\rho \]

Then $\rho$ is invariant under the one-parameter group of rotations about the $n$
axis. The records of the detector in a scattering experiment remain unaffected
when the source producing particles in such an eigenstate $\rho$ is subject to
rotations about the $n$ axis.

Therefore, the informative property (53) (no statistical spread when measuring
$A$ on $\rho$) implies the invariance property (54). This is not the case on the
classical level, i.e. for vanishing $\hbar$, because the equation

\[ A\rho = a\rho \tag{55} \]

possesses solutions for which the equation:

\[ \{A, \rho\} = 0 \tag{56} \]

is not satisfied.

Roughly speaking: on the quantum level (finite $\hbar$) information implies
symmetry. On the classical level (vanishing $\hbar$) this relationship breaks down.
Therefore, the purely formal methods of the $\hbar \to 0$ asymptotics are
unsatisfactory and insufficient when studying classical counterparts of
eigenstates. In fact, on the quantum level, the eigenstates of $A$ are uniquely
defined by (53) or (53a). Therefore, the equation (52) suggests that we define
classical eigenvalues and eigenensembles of $A$ as those satisfying (55). However,
from the physical point of view it is hard to accept such an approach, because the
eigenstates would then lose their fundamental physical symmetries. The physical
qualitative understanding of the correspondence principle rather suggests
that we should define classical eigenstates of $A$ as those satisfying both (55)
and (56), i.e. the pair

\[ A\rho = a\rho, \qquad \{A, \rho\} = 0 \tag{57} \]

where $a$ is a regular value of $A$.

In fact, from the physical laboratory point of view, the notions of information
and symmetry are essentially independent of each other, in spite of their
accidental relationship within the framework of Hamiltonian quantum
mechanics. They involve two different kinds of physical operations:

(1). Measurements, mainly scattering experiments and statistical analysis of
spreads of results, recorded by films or counters for example.

(2). Transformations, motions performed on the experimental set-up, e.g.
on sources and detectors, the main examples being rigid translations and
rotations.

Therefore, to retain the physical structure of eigenstates, we have to define
them on the classical level by means of (57). This is confirmed by the
quasiclassical scattering experiments.

When $A$ is an analytic function, then the only probabilistic solutions of (57)
are distributions:

\[ \rho = F\,\delta(A - a) \tag{58} \]

where

\[ \{F, A\} = (A - a)\,G \tag{59} \]

for some function $G$. The equation (59) is equivalent to:

\[ \{F, A\}|_{M_{(A,a)}} = 0 \tag{59a} \]

where

\[ M_{(A,a)} = \{p \in P : A(p) = a\} \]

Remark: $F$ in (58) need not be differentiable; we assume only that $\{F, A\}$
exists, i.e. there exists the derivative of $F$ in the direction of the vector
field $\overline{dA}$, which is related to the Pfaff form $dA$ via the
$\Gamma$-raising of indices:

\[ (\overline{dA})^a = \Gamma^{ab}\,(dA)_b \]


In general, (57) or equivalently (59a) possesses many solutions. To restrict this
arbitrariness it is necessary to perform additional measurements or, in
mathematical terms, to add some similar conditions. Hence, let us consider the
following system of classical eigenconditions:

\[ A_i \rho = a_i \rho, \qquad \{A_i, \rho\} = 0, \qquad i = 1 \ldots m \tag{60} \]

where $a = (a_1 \ldots a_m) \in \mathbb{R}^m$ is a regular value of the mapping
$A = (A_1, \ldots, A_m): P \to \mathbb{R}^m$ (this mapping transforms $p \in P$
onto $(A_1(p) \ldots A_m(p)) \in \mathbb{R}^m$).

Contrary to the system of proper equations:

\[ A_i \rho = a_i \rho, \qquad i = 1 \ldots m \tag{61} \]

which is always compatible, the eigenconditions (60) need not be compatible
and as a rule they are not.

To be more precise we give the following definitions. A classical system of
eigenconditions (60) is said to be compatible when:

(1). It is completely integrable in the sense of the theory of systems of
differential equations;
(2). It is regular in the sense that $a = (a_1 \ldots a_m)$ is a regular value of
the analytic mapping

\[ A = (A_1 \ldots A_m): P \to \mathbb{R}^m \]

One can easily show that the regular system (60) is compatible if and only if
there exist functions $C^k_{ij}$ such that:

\[ \{A_i, A_j\} = \{A_i - a_i, A_j - a_j\} = C^k_{ij}\,(A_k - a_k) \tag{62} \]

When $A_1 \ldots A_m$ are functionally independent (which ensures the regularity
of (60)), then the only probabilistic solutions of (60) have the form:

\[ \rho = F\,\delta(A_1 - a_1) \ldots \delta(A_m - a_m) \tag{63} \]

where

\[ \{A_i, F\} = F\, C^j_{ij} + G_i^j\,(A_j - a_j) \tag{64} \]

for some functions $G_i^j$.
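
The condition (64) can be checked by inserting (63) into the bracket condition
of (60). The following sketch treats the delta functions formally as
distributions and uses only the Leibniz and chain rules for the Poisson bracket
together with (62); the shorthand $\delta_a(A)$ and the Kronecker symbol
$\delta_{jk}$ are introduced here for this derivation only.

```latex
\begin{align*}
\delta_a(A) &:= \delta(A_1 - a_1) \ldots \delta(A_m - a_m),
  \qquad \rho = F\,\delta_a(A) \\
\{A_i, \rho\}
  &= \{A_i, F\}\,\delta_a(A)
   + F \sum_k \{A_i, A_k\}\, \frac{\partial \delta_a}{\partial A_k}(A) \\
  &= \{A_i, F\}\,\delta_a(A)
   + F \sum_{j,k} C^j_{ik}\,(A_j - a_j)\, \frac{\partial \delta_a}{\partial A_k}(A) \\
  &= \Bigl( \{A_i, F\} - F \sum_k C^k_{ik} \Bigr)\, \delta_a(A),
  \qquad \text{since } (A_j - a_j)\, \frac{\partial \delta_a}{\partial A_k}(A)
  = -\delta_{jk}\, \delta_a(A)
\end{align*}
```

Hence the bracket condition requires $\{A_i, F\} = F\, C^k_{ik}$ on the support
of $\rho$; away from the support the difference may be any combination of the
functions $(A_j - a_j)$, which is the second term of (64).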

Classical compatibility conditions (62) are obvious counterparts of the
corresponding quantum equations (46). They possess a geometric interpretation
which is interesting from the point of view of both physics and mathematics.
In fact, let

\[ M_{(A,a)} = \{p \in P : A_i(p) = a_i,\ i = 1 \ldots m\} \]

Then

\[ \{A_i, A_j\}|_{M_{(A,a)}} = \{A_i - a_i, A_j - a_j\}|_{M_{(A,a)}} = 0 \tag{62a} \]

Equations (62) and (62a) imply that $M_{(A,a)}$ is what Dirac, Bergmann, Goldberg
and others have called I-class constraints (Bergmann, 1966; Bergmann, 1970;
Bergmann and Goldberg, 1955; Dirac, 1950; Dirac, 1951; Dirac, 1955 a,b;
Dirac, 1964; Sławianowski, 1971; Sławianowski, 1975; Tulczyjew, 1968).

Roughly speaking, the submanifold $M \subset P$ is said to be of first class if any
vector $\Gamma$-orthogonal to $M$ must be at the same time tangent to $M$. More
strictly, let $p \in M$ and let $T_pM$ denote the linear subspace tangent to $M$ at
$p$. Now let $k \in \Pi$ be a vector such that for arbitrary $u \in T_pM$:

\[ \langle \Gamma, k \otimes u \rangle = \Gamma_{ab}\, k^a u^b = 0 \tag{65} \]

$M$ is a first-class submanifold if, for arbitrary $p \in M$, (65) implies
$k \in T_pM$.

Let $V(M)$ denote the set of analytic functions vanishing on $M$, for an arbitrary
subset $M \subset P$:

\[ V(M) = \{f \in C^\omega(P) : f|_M = 0\} \tag{66} \]

Obviously, $V(M)$ is an ideal in the associative algebra $C^\omega(P)$, i.e. it is
closed under pointwise multiplication by arbitrary analytic functions. What is
interesting is that $V(M)$ is at the same time a Lie algebra (i.e. it is closed
under the Poisson bracket) if and only if $M$ is a first-class submanifold. In
particular, let $\rho$ be an arbitrary classical probability distribution and

\[ \mathcal{E}_\rho = \{f \in C^\omega(P) : f\rho = 0\} \tag{67} \]

[compare with (47)]. Obviously, $\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$. For
arbitrary $\rho$, $\mathcal{E}_\rho$ is an associative ideal. It is a Lie
subalgebra if and only if the support of $\rho$, $\mathrm{Supp}\,\rho$, is an
I-class submanifold, i.e. if $\rho$ satisfies some compatible system of classical
eigenconditions. Let us recall the corresponding property of the quantum
space $E_\rho$. The analogy is obvious.

In the following, the ideals $V(M)$ corresponding to I-class manifolds $M$
will be called self-consistent. Any such ideal has the form
$\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$, where $\rho$ satisfies some compatible
system (60). Probability distributions $\rho$ such that $\mathcal{E}_\rho$ is not
self-consistent ($\mathrm{Supp}\,\rho$ fails to be an I-class submanifold) must not
be looked on as classical counterparts of quantum states. They are
non-interpretable from the point of view of the correspondence principle, because
they violate the uncertainty principle on the quasiclassical level. A typical
example of such a 'wrong' distribution is:

\[ \rho = F\,\delta(q^1 - a^1)\,\delta(p_1 - b_1) \tag{68} \]

Both $q^1$ and its conjugate momentum $p_1$ are simultaneously dispersion-free on
such a $\rho$; hence $\mathrm{Supp}\,\rho$ fails to be of I-class, because
$\{q^1, p_1\} = 1 \ne 0$, and $\rho$ is non-interpretable in terms of the
quasiclassical theory.

On the quantum level special attention is paid to the pure states, which carry
the maximal possible information and could be defined as those satisfying a
maximal system of compatible eigenequations (45), i.e. answering a maximal
number of operational 'questions'. The corresponding left ideal $E_\rho$ [cf.
expression (47)] then becomes maximal. The question arises as to the classical
counterparts. The correspondence principle suggests that we look for classical
distributions $\rho$ for which $\mathcal{E}_\rho$ is a maximal self-consistent
ideal. Roughly speaking, such a distribution $\rho$ satisfies some maximal system
of compatible eigenconditions (60); 'maximal' means here that any extended system
of eigenconditions

\[ A_i \chi = a_i \chi, \quad F\chi = 0, \qquad \{A_i, \chi\} = 0, \quad \{F, \chi\} = 0 \tag{60a} \]

is consistent only if

\[ F = \sum_i (A_i - a_i)\, F_i \]

for some functions $F_i$. Hence (60a) is then equivalent to (60).

Obviously, $\mathcal{E}_\rho$ is a maximal self-consistent ideal of the type
$V(M)$ if and only if $\mathrm{Supp}\,\rho$ is a minimal closed analytic
submanifold of I-class in $P$. The lowest possible dimension of such a manifold
equals the number of degrees of freedom: $n = \frac{1}{2} \dim P$. The I-class
manifold then becomes what mathematicians call a lagrangian manifold, i.e. a
maximal $\Gamma$-self-orthogonal manifold. In rigorous terms: $M \subset P$ is said
to be an isotropic manifold if any vector tangent to it is at the same time
$\Gamma$-orthogonal to $M$, that is, if for an arbitrary pair of vectors tangent to
$M$ at $p$, $u, v \in T_pM \subset \Pi$, the following orthogonality equation
holds:

\[ \langle \Gamma, u \otimes v \rangle = \Gamma_{ab}\, u^a v^b = 0 \tag{69} \]

One can easily show that $\dim M \le n$.

An isotropic manifold is said to be lagrangian if there are no vectors
transversal to it and $\Gamma$-orthogonal to it at the same time. Obviously an
isotropic manifold is lagrangian if and only if its dimension equals $n$
(Weinstein, 1971; Weinstein, 1973; Sławianowski, 1971; Arnold, 1974).
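
For linear (affine) submanifolds these definitions reduce to linear algebra on
the tangent space, which the following Python sketch illustrates. It is a
minimal example under stated assumptions: $n = 2$, vectors are written in
canonical coordinates as $(q^1, q^2, p_1, p_2)$, the canonical form of $\Gamma$
is used with one particular sign (the tests depend only on whether $\Gamma$
vanishes), and the function names are hypothetical.

```python
def gamma(u, v, n):
    """Canonical skew-symmetric form on R^(2n); vectors are (q^1..q^n, p_1..p_n).

    Assumed sign convention: Gamma(u, v) = sum_i (u_q^i * v_p_i - u_p_i * v_q^i).
    """
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n))

def is_isotropic(basis, n):
    """True if Gamma vanishes on every pair of basis vectors, cf. (69)."""
    return all(gamma(u, v, n) == 0 for u in basis for v in basis)

def is_lagrangian(basis, n):
    """Isotropic and of dimension n; the basis is assumed linearly independent."""
    return is_isotropic(basis, n) and len(basis) == n

def unit(k, size):
    """k-th coordinate vector of R^size."""
    return [1 if i == k else 0 for i in range(size)]

n = 2
# Tangent space of a surface q^1 = const, q^2 = const: spanned by d/dp_1, d/dp_2.
q_fixed = [unit(2, 2 * n), unit(3, 2 * n)]
# Tangent space of a surface q^1 = const, p_1 = const: spanned by d/dq^2, d/dp_2.
q1_p1_fixed = [unit(1, 2 * n), unit(3, 2 * n)]
# A single direction d/dq^1: isotropic, but of too low a dimension.
line = [unit(0, 2 * n)]

print(is_lagrangian(q_fixed, n))      # True
print(is_isotropic(q1_p1_fixed, n))   # False
print(is_isotropic(line, n), is_lagrangian(line, n))  # True False
```

The second example has the right dimension but contains a conjugate pair of
tangent directions, so it is not isotropic; the third is isotropic but not
maximal.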

When $\mathrm{Supp}\,\rho$ is a closed and connected lagrangian manifold, then
obviously $\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$ is a maximal self-consistent
ideal in $C^\omega(P)$ and $\rho$ is a classical counterpart of a pure quantum
state. Typical and suggestive examples are as follows:

\[ \rho = \delta(q^1 - a^1) \ldots \delta(q^n - a^n) \tag{70a} \]

\[ \rho = \delta(p_1 - b_1) \ldots \delta(p_n - b_n) \tag{70b} \]

(where $a$ and $b$ are constants), or, in a six-dimensional phase space $P$:

\[ \rho = \delta(L_i n^i - m)\, \delta(L^2 - l^2)\, \delta\!\left(\tfrac{1}{2} p^2 + V(r) - E\right) \tag{71} \]

where

\[ L_i = \varepsilon_{ij}{}^{k}\, q^j p_k, \qquad L^2 = \sum_i L_i^2, \qquad p^2 = \sum_i p_i^2, \qquad r^2 = \sum_i (q^i)^2 \]

$n$ is a three-dimensional unit vector, and $m$, $l$, $E$ are constants. Obviously
equations (70) describe classical statistical ensembles with sharply defined
values of $q^i$ and $p_i$, respectively (on the contrary, $p_i$ and $q^i$,
respectively, then become completely undetermined, according to the classical
uncertainty relations). Expression (71) is an ensemble with defined values of the
component of the angular momentum along $n$, the square of the angular momentum
and the energy. Hence, (71) corresponds to the classical partial wave analysis.
Distributions (70) and (71) are especially convenient when analysing classical
scattering experiments. They are directly related to the Hamilton-Jacobi theory.
Moreover, it appears that properties of classical probability distributions
concentrated on lagrangian submanifolds are interpretable in terms of the
classical Hamilton-Jacobi theory. There is nothing surprising in this: such
distributions are classical counterparts of pure quantum states, which admit a
description in terms of wave functions. But it is known even from elementary
textbooks that the $\hbar \to 0$ asymptotics of the Schrödinger equation imposed on
the wave function $\Psi = \sqrt{D} \exp(iS/\hbar)$ leads to the classical
Hamilton-Jacobi equation imposed on the phase $S$ and to the continuity equation
for a probabilistic fluid (with velocity field given by the gradient of $S$)
(Landau and Lifshitz, 1958; Messiah, 1965).

Let us investigate these problems in some detail. We start with interpreting
some of our results in terms of wave functions.

Let us consider a pure quantum state, the wave function of which is $\Psi(q^i)$ ($q^i$,
$p_i$ are canonical affine coordinates on P). The corresponding Wigner function ρ
is given by (42)

P = (£)"\V* [q l ~] e*PHVW* («' + f) d »- < 42 > 

Let us put:

$\Psi = \sqrt{D}\, \exp\left(\frac{i}{\hbar} S\right)$    (72)


We will write ρ[D, S] to indicate explicitly the functional dependence of ρ on
D, S. D is assumed to be continuous and S twice differentiable. We aim to find
the classical limit of ρ[D, S] expressed in terms of D and S. In the lowest order
of the WKB approximation, D and S are assumed to be independent of ħ (Landau
and Lifshitz, 1958; Messiah, 1965). Moreover, the ħ-independence of the
physical interpretation of D and S is obvious even without any use of the
WKB approximation. In fact D is a probability distribution for positions and S
is related to a spread of momentum:

$\langle \Psi | Q^i | \Psi \rangle = \int D(q^1 \ldots q^n)\, q^i \, dq^1 \ldots dq^n$    (73)

$\langle \Psi | P_i | \Psi \rangle = \int D(q^1 \ldots q^n)\, \partial_i S(q^1 \ldots q^n)\, dq^1 \ldots dq^n$    (74)

provided Ψ is well-behaved at infinity (Mackey, 1963). These formulas do not
involve the Planck constant explicitly.
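
Formula (74) can be checked numerically in one dimension. The following is a minimal sketch, not part of the original text: the Gaussian density D, the quadratic phase S, the grid and the function name `momentum_expectations` are all illustrative assumptions. It compares the operator expression $\langle\Psi|P|\Psi\rangle$, with $P = -i\hbar\, d/dq$ approximated by central differences, against $\int D\, \partial S \, dq$ for a given ħ.

```python
import cmath
import math


def momentum_expectations(hbar, n=4001, half_width=10.0):
    """Return (<Psi|P|Psi>, integral of D * dS/dq) for an assumed D and S."""
    h = 2.0 * half_width / (n - 1)
    qs = [-half_width + k * h for k in range(n)]

    def D(q):   # assumed normalized position density
        return math.exp(-q * q) / math.sqrt(math.pi)

    def S(q):   # assumed smooth phase
        return 0.7 * q + 0.3 * q * q

    def dS(q):  # its derivative
        return 0.7 + 0.6 * q

    # Psi = sqrt(D) * exp(i S / hbar), as in (72)
    psi = [math.sqrt(D(q)) * cmath.exp(1j * S(q) / hbar) for q in qs]

    quantum = 0.0
    for k in range(1, n - 1):
        dpsi = (psi[k + 1] - psi[k - 1]) / (2.0 * h)   # central difference
        quantum += (psi[k].conjugate() * (-1j * hbar) * dpsi).real * h

    classical = sum(D(q) * dS(q) for q in qs) * h      # right-hand side of (74)
    return quantum, classical
```

Calling `momentum_expectations(1.0)` and `momentum_expectations(0.5)` should give pairs that agree up to discretization error, illustrating that the right-hand side of (74) does not depend on ħ once D and S are fixed.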

Let $M_S$ be a maximal submanifold of P on which all functions $p_i - \partial_i S(q^j)$
vanish:

$[p_i - \partial_i S(q^j)]\,|\,M_S = 0$    (75)


It is obvious that $M_S$ is a lagrangian submanifold:

$\{p_i - \partial_i S(q^k),\; p_j - \partial_j S(q^k)\} = (\partial_i \partial_j - \partial_j \partial_i) S(q^k) = 0$

Now, let us introduce the following probability distribution $\rho_{cl}[D, S]$ on P,
concentrated on $M_S$:

$\rho_{cl}[D, S] = D(q^k)\, \delta(p_1 - \partial_1 S(q^k)) \cdots \delta(p_n - \partial_n S(q^k))$    (76)

Obviously, $\rho_{cl}$ is a classical probability distribution, non-negative in the usual
local sense. Nevertheless, as far as we are interested in analysing measurements
of first-order polynomials of canonical affine coordinates, $\rho_{cl}[D, S]$ is
essentially equivalent to ρ on the rigorous quantum level (finite ħ):

$\int (\alpha_i q^i + \beta^i p_i)\, \rho_{cl}[D, S]\, d\mu = \int (\alpha_i q^i + \beta^i p_i)\, \rho[D, S]\, d\mu$

Of course for higher order polynomials this formula breaks down (excepting
polynomials depending on $q^i$ only). However, in the classical limit ρ becomes
exactly equivalent to $\rho_{cl}$. In fact, making use of the aforementioned
independence of D and S on ħ in the lowest order of the WKB expansion, we find

$\lim_{\hbar \to 0} \rho[D, S] = \rho_{cl}[D, S]$    (77)

Obviously, the limit is to be understood in the sense of generalized functions
(Schwartz, 1950-1951).

Now, let H be an arbitrary analytic function on $R^{2n}$ and $\hat{H}$ the operator
corresponding to $H(q^1 \ldots q^n, p_1 \ldots p_n): P \to R$ in the sense of the Weyl
prescription [cf. (40) and (41)]. We assume $\hat{H}$ to be bounded in $L^2(R^n)$.

Let us consider the eigenequation

$\hat{H}\Psi(q) = E\Psi(q)$    (78)

where $\Psi \in L^2(R^n)$. We put $\Psi = \sqrt{D}\exp(iS/\hbar)$. Equation (78) implies that

$H(q, p) \perp \rho[D, S] = E\rho[D, S]$    (79)

$H(q, p) \perp \rho[D, S] - \rho[D, S] \perp H(q, p) = 0$    (80)

In the classical limit, these equations become

$H(q, p)\, \rho_{cl}[D, S] = E\rho_{cl}[D, S]$    (81)

$\{H(q, p), \rho_{cl}[D, S]\} = 0$    (82)

Expression (81) implies that

$H(\ldots q^i \ldots, \ldots \partial_i S(q^k) \ldots) = E$    (83)

which is none other than the time-independent Hamilton-Jacobi equation
imposed on S. In geometric terms it means that

$H(q, p)\,|\,M_S = E$    (84)




[Figure 1: labels in the drawing: dS; p with statistical weight $D(q) = \Psi^*(q)\Psi(q)$]

Figure 1 Wave functions and phase-space distributions. Systems of dots represent an
exact quantum Wigner function ρ[Ψ] = ρ[D, S]. The surface $M_S$ given by equations
$p_i = \partial S/\partial q^i$ is a support of the classical probability distribution:

$\rho_{cl}[\Psi] = \rho_{cl}[D, S] = D(q)\, \delta\left(p_1 - \frac{\partial S}{\partial q^1}\right) \cdots \delta\left(p_n - \frac{\partial S}{\partial q^n}\right)$

Both ρ[D, S] and $\rho_{cl}[D, S]$ lead to identical expectation values for canonical affine
coordinates in the phase space. When $\hbar \to 0$, ρ approaches $\rho_{cl}$ and the above system of dots
shrinks to $M_S$.

Restricting the canonical vector field $\overline{dH}$ to $M_S$ (this is possible, because $\overline{dH}$ is
tangent to $M_S$) and projecting it to the configuration space (the quotient space
of P with respect to fibres on which all $q^i$ are constant) we get the following
S-velocity field V[S]:

$V[S]^i = \partial_{n+i} H(\ldots q^j \ldots, \ldots \partial_j S(q^k) \ldots)$    (85)

For example, when

$H = \tfrac{1}{2} g^{ij} p_i p_j + V(q^i)$

(85) gives $V[S]^i = g^{ij}\, \partial_j S(q^k)$. In general, (82) is equivalent to:

$\pounds_{V[S]} D(q) = 0$    (86)


where $\pounds_{V[S]} D(q)$ denotes the Lie derivative of D(q) with respect to V[S].
Expression (86) means that D(q) is invariant during the motion; hence, it is a
continuity equation for quasiclassical stationary states. As a geometric object,
D(q) is a density of weight one, hence (86) is a shorthand for

$V[S]^i\, \partial_i D(q) + D(q)\, \frac{\partial V[S]^i}{\partial q^i} = 0$

Finally the quasiclassical counterpart of (78) is a couple of conditions:

$H(\ldots q^i \ldots, \ldots \partial_i S(q^k) \ldots) = E$    (87a)

$V[S]^i\, \partial_i D(q^k) + D(q^k)\, \frac{\partial V[S]^i}{\partial q^i} = 0$    (87b)

Conditions (87) could be derived directly from (78) after substitution of (40),
(41) and (72) and expanding it up to the first asymptotic order in ħ. One should
only make use of the method of stationary phase; conditions (87a) and (87b)
then result as the real and imaginary parts of the asymptotic formula respectively
(Tulczyjew, 1968).

A system of equations (87) is completely integrable if and only if the
conditions (62) are satisfied:

$\{H_a, H_b\} = C_{ab}{}^{c}\, (H_c - E_c)$


The maximum possible number of independent and compatible simultaneous
eigenconditions (87) imposed on the same pair D, S equals the number of
degrees of freedom n. Any system of n independent compatible eigenconditions
(87) possesses a solution which is unique up to an additive constant in S
and a multiplicative one in D. It is essentially equal to the quasiclassical
solution found by Van Vleck (1928) and proved by him to coincide with the
asymptotic WKB solution. In more detail, let $A_i: R^{2n} \to R$, i = 1...n, be
analytic functions such that the corresponding phase space functions $A_i(q, p)$
are in involution:

$\{A_i(q, p), A_j(q, p)\} = 0$    (88)

Let us consider the corresponding compatible system of eigenconditions
imposed on the quasiclassical wave function $\sqrt{D}\exp(iS/\hbar)$:

$A_i(\ldots q^j \ldots, \ldots \partial_j S(q^k) \ldots) = a_i$    (89a)

$V_i[S]^j\, \partial_j D(q^k) + D(q^k)\, \partial_j V_i[S]^j = 0$    (89b)

The autonomous subsystem (89a) imposed on S possesses a unique (up to an
additive constant) solution which depends in a parametric way on the constants $a_i$.
Let us insert this dependence explicitly into S by introducing a function
$S: R^{2n} \to R$ such that $S(q^i, a_i)$ is a solution of (89a) for arbitrary values of $a_i$.
Obviously, $S(q^i, a_i)$ is a complete integral for any of the Hamilton-Jacobi
equations (89a). Substituting $S(q^i, a_i)$ into (89b) we get an autonomous system
of first-order differential equations imposed on D. It is completely integrable


and one can show that the only solution (up to normalization) of this system is
given by the Van Vleck determinant:

$D(q^i, a_j) = \mathrm{Det}\left[\frac{\partial^2 S}{\partial q^i\, \partial a_j}\right]$
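
The statement can be illustrated for one degree of freedom, where the determinant reduces to a single mixed derivative. The sketch below is an illustration under stated assumptions, not the author's construction: the quartic potential, the unit mass and the function names are hypothetical. It takes $H = p^2/2 + V(q)$, uses $\partial S/\partial q = \sqrt{2(a - V(q))}$ from the complete integral of $H = a$, evaluates $\partial^2 S / \partial q\, \partial a$ by a finite difference in a, and checks that the resulting D makes the flux $D \cdot V[S]$ constant, as required by (89b).

```python
import math


def potential(q):
    """Assumed example potential V(q) = q^4 / 4."""
    return 0.25 * q ** 4


def dS_dq(q, a):
    """p = dS/dq on the level set p^2/2 + V(q) = a (classically allowed region)."""
    return math.sqrt(2.0 * (a - potential(q)))


def van_vleck(q, a, h=1e-6):
    """Mixed derivative d^2 S / (dq da), central difference in the parameter a."""
    return (dS_dq(q, a + h) - dS_dq(q, a - h)) / (2.0 * h)


def continuity_residual(q, a, h=1e-3):
    """d/dq of the flux D * V[S], with V[S] = dH/dp = p = dS/dq."""
    def flux(x):
        return van_vleck(x, a) * dS_dq(x, a)
    return (flux(q + h) - flux(q - h)) / (2.0 * h)
```

In this one-dimensional case the mixed derivative equals 1/p, so the flux is identically one; the residual returned by `continuity_residual` is then zero up to finite-difference noise.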


In particular, S can be chosen to be an arbitrary complete integral of the
stationary Hamilton-Jacobi equation

$H(\ldots q^j \ldots, \ldots \partial_j S(q^k, a_k) \ldots) = E$

where H is a Hamiltonian function of the mechanical system.

The geometric interpretation of the Van Vleck solution in terms of symplectic
geometry has been given in Sławianowski (1972). To some extent, the Van
Vleck object can be guessed a priori, in terms of pure geometry, without any
appeal to the quasiclassical approximation.

Obviously the quasiclassical Wigner-Moyal density in P corresponding to
the Van Vleck solution of (89a) is given by

$\rho_{cl}[D, S] = \delta(A_1 - a_1) \cdots \delta(A_n - a_n)$

$= \mathrm{Det}\left[\frac{\partial^2 S}{\partial q^i\, \partial a_j}\right] \delta(p_1 - \partial_1 S(q, a)) \cdots \delta(p_n - \partial_n S(q, a))$


Remark: When ρ is a classical probability distribution the support of which is a
closed connected lagrangian submanifold, then $\mathcal{E}_\rho$ is a maximal self-consistent
ideal in $C^\infty(P)$ and consequently ρ is a quasiclassical pure state. Contrary to
what might be expected, the converse statement is false. In fact, let $H: P \to R$ be
such a Hamiltonian that the corresponding dynamical system does not possess
any non-trivial constants of motion (excepting the Hamiltonian itself, of
course). This is the case, for example, when the system is ergodic (any classical
trajectory is dense on some value-surface of H). The quotient sets of
value-surfaces of H with respect to the congruence of classical trajectories (integral
curves of $\overline{dH}$) fail to be differential manifolds in any natural way. Then it is easy
to see that the only solution, up to a constant factor, of the classical eigenconditions

$H\rho = E\rho, \quad \{H, \rho\} = 0$

is the microcanonical ensemble

$\rho = \delta(H - E)$

$\mathcal{E}_\rho$ is a maximal self-consistent ideal in $C^\infty(P)$, although Supp ρ fails to be a
lagrangian submanifold (it is only an I-class submanifold of P).

We have shown above in what way and in what sense the asymptotics $\hbar \to 0$
lead to the Hamilton-Jacobi theory of classical mechanics. The crucial point
was that the purely formal asymptotic rules were inadequate to get physically
reasonable and satisfactory results. They had to be completed by qualitative
physical ideas such as information and symmetry properties. The necessity for
such 'supplements' is a typical feature of the formal $\hbar \to 0$ approach, in contrast
to the more physical although less elegant asymptotics of large quantum
numbers.

The physical analysis based on information and symmetry properties of 
statistical ensembles leads to results which do not agree with some current 
views concerning pure states. It is well-known that pure states carry the 
maximal information about the system. Hence it is reasonable to conjecture 
their classical counterparts to be distributions or measures the supports of 
which degenerate to the subsets of phase measure zero. One commonly 
believes that they are Dirac measures concentrated on single points in a phase 
space. Hence, within the framework of this currently used analogy, classical
pure states and points of the phase space are essentially identical notions. In
spite of these views but in agreement with the old Hamiltonian ideas of Synge 
and Dirac (Dirac, 1964; Synge, 1953; Synge, 1954) we have shown that 
quasiclassical probability distributions corresponding to pure states are con- 
centrated on lagrangian submanifolds the dimension of which equals the 
number of degrees of freedom. Hence, to any quasiclassical pure state there is 
attached some set of the usual, classical states — a lagrangian manifold in a 
phase space. Each of these classical states is taken with its own statistical 
weight. Hence classical states, that is points in a phase space, are 'hidden 
parameters' of quasiclassical ones. We have presented both physical and a 
priori geometrical arguments in support of our views. Let us notice that our 
approach agrees nicely with the Bohr-Sommerfeld conditions of the old 
quantum theory. In fact the objects upon which these conditions are imposed 
are none other than lagrangian manifolds in a phase space. Let 
$(\ldots q^i \ldots, \ldots p_i \ldots)$ be canonical affine coordinates on P; we introduce the
Pfaff form ω:

$\omega = p_i\, dq^i$    (93)

Obviously in arbitrary affine coordinates
Now, let $M_a$ be a lagrangian submanifold:

$M_a = \{p \in P : A_i(p) = a_i,\; i = 1 \ldots n\}$

$\{A_i, A_j\} = 0$

The Bohr-Sommerfeld conditions imposed on $M_a$ mean that the line integral
of ω over any closed curve in $M_a$ equals the Planck constant multiplied by some
integer characterizing the loop of integration. When the $M_a$ are topologically
equivalent to tori, which is the case in the theory of multiply periodic systems, these
conditions give rise to non-trivial restrictions on the admissible quantized (in
the Bohr-Sommerfeld sense) values of the physical quantities $A_1 \ldots A_n$.
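
For one degree of freedom the condition can be made concrete. The following minimal sketch is an illustration only: the unit-frequency oscillator $H = (p^2 + q^2)/2$, the clockwise orientation of the loop and the function names are assumptions. It evaluates the line integral of $\omega = p\, dq$ around the level curve H = E, which equals $2\pi E$, so that the condition 'integral = integer × $2\pi\hbar$' selects $E = k\hbar$.

```python
import math


def loop_integral(energy, steps=20000):
    """Line integral of p dq around the level curve (p^2 + q^2)/2 = energy."""
    r = math.sqrt(2.0 * energy)
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        # clockwise parametrization: q = r cos(theta), p = -r sin(theta)
        p = -r * math.sin(theta)
        dq = -r * math.sin(theta) * dtheta
        total += p * dq
    return total


def quantized_energies(hbar, count):
    """Energies whose loop integral equals k * (2 pi hbar), k = 1..count."""
    return [k * hbar for k in range(1, count + 1)]
```

For instance, each energy in `quantized_energies(0.1, 3)` should give `loop_integral(E) / (2 * math.pi * 0.1)` close to an integer.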

The currently used analogy between pure quantum states and points of the 
phase space is based on the properties of Gaussian-shape Wigner functions. In 


spite of its formal correctness this argument is physically wrong. Let us
investigate this in some detail. Let $(q^i, p_i)$ be canonical affine coordinates and

$H_{(a,b)} = \tfrac{1}{2} \sum_i (q^i - a^i)^2 + \tfrac{1}{2} \sum_i (p_i - b_i)^2$

Obviously, $H_{(a,b)}$ is a Hamiltonian of the harmonic oscillator the equilibrium of
which is given by the point in P with coordinates $(a^i, b_i)$. The corresponding
Gaussian-Wigner function is given by

E (a ,b) = (rrh) " exp |^ --H (a , 6) J 


One can easily show that $E_{(a,b)}$ is a Wigner function describing some pure state:

$(2\pi\hbar)^{-n} \int E_{(a,b)}\, dq^1 \ldots dp_n = 1$    (98)

$\bar{E}_{(a,b)} = E_{(a,b)}$    (99)

$E_{(a,b)} \perp E_{(a,b)} = E_{(a,b)}$    (100)

$\int E_{(a,b)}\, (A^* \perp A)\, dq^1 \ldots dp_n \ge 0$    (101)


(As a matter of fact, $E_{(a,b)}$ is positive for arbitrary ħ in the local sense too:
$E_{(a,b)} > 0$). One can easily show that

$\lim_{\hbar \to 0} E_{(a,b)} = \delta(q^1 - a^1) \cdots \delta(p_n - b_n)$



This could be understood in such a way that at least some pure states possess
classical counterparts concentrated on single points (point measures). However,
this is only a typical example of the mistakes arising when no care is taken with
regard to the scale of quantum numbers. In fact $E_{(a,b)}$ describes the physical
situation with small quantum numbers. To see this, let us notice that it is
essentially concentrated in a region the phase volume of which is of the order
$\hbar^n$. Moreover, $E_{(a,b)}$ is none other than the ground state of the harmonic
oscillator:

$H_{(a,b)} \perp E_{(a,b)} = \tfrac{1}{2} n \hbar\, E_{(a,b)}$

where n is the number of degrees of freedom. Therefore, according to what we
have mentioned above, the $\hbar \to 0$ asymptotics of $E_{(a,b)}$ is physically meaningless.
We finish this chapter with some remarks concerning quasiclassical problems
of the superposition principle and interference of amplitudes. Let us consider
wave mechanics in $R^n$. An arbitrary wave function $\Psi = \sigma \exp(iS/\hbar)$ gives
rise to some geometric figure in $R^{n+1}$, namely the graph of its phase:

$P[\Psi] = P[\sigma, S] = \{(x, S(x)) : x \in R^n\}$



Now let us consider an arbitrary m-parameter family of wave functions

$\{\Psi_a = \sigma_a \exp\left(\frac{i}{\hbar} S_a\right),\; a \in R^m\}$

and the corresponding system of phase diagrams

$P[\Psi_a] = P[\sigma_a, S_a]$

Let $\varphi = \mu \exp(it/\hbar)$ be some complex amplitude on $R^m$. It gives rise to the
continuous superposition

$\Psi = \sigma \exp\left(\frac{i}{\hbar} S\right) = \int \varphi(a)\, \Psi_a \, da^1 \ldots da^m$    (105)

which can be represented by its phase graph P[Ψ].

As usual in the lowest order of the WKB approximation we assume $\sigma_a$, $S_a$, μ,
t not to be dependent on the Planck constant. Let ${}^0\Psi = {}^0\sigma \exp(i\,{}^0S/\hbar)$ denote the
lowest order term of the asymptotic expansion of Ψ about ħ = 0, renormalized
so as to retain the probabilistic interpretation in the classical limit:

$\langle {}^0\Psi | {}^0\Psi \rangle = \int {}^0\bar{\Psi}\; {}^0\Psi = 1$

The corresponding phase diagram will be denoted as $P[{}^0\Psi] = P[{}^0\sigma, {}^0S]$.

Making use of the method of stationary phase when calculating (105) we get
the result that essentially $P[{}^0\Psi]$ is an envelope of the family

$\{P[\Psi_a \cdot \varphi(a)]\} = \{P[\mu(a) \cdot \sigma_a,\; t(a) + S_a]\}$

This is just the quasiclassical Huyghens interference rule. More rigorously

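${}^0\Psi(x) = N\, \varphi(a_0(x))\, \Psi_{a_0(x)}(x)\, \frac{\exp\left[\tfrac{1}{4}\pi i\, \mathrm{sign}\, d^2(S_{(\cdot)}(x) + t(\cdot))_{a_0(x)}\right]}{\sqrt{|\mathrm{hes}_{a_0(x)}(S_{(\cdot)}(x) + t(\cdot))|}}$

$= N\, \mu(a_0(x))\, \sigma_{a_0(x)}(x)\, \exp\left[\frac{i}{\hbar}\left(S_{a_0(x)}(x) + t(a_0(x))\right)\right] \frac{\exp\left[\tfrac{1}{4}\pi i\, \mathrm{sign}\, d^2(S_{(\cdot)}(x) + t(\cdot))_{a_0(x)}\right]}{\sqrt{|\mathrm{hes}_{a_0(x)}(S_{(\cdot)}(x) + t(\cdot))|}}$    (106)

where $a_0(x)$ is the solution of equations:

$\frac{\partial}{\partial a^i}\left[S_{(\cdot)}(x) + t(\cdot)\right] = 0$    (107)

N is an ħ-dependent normalizing factor ensuring that ${}^0\Psi$ retains its probabilistic
interpretation, $d^2 f_y$ is the matrix of the second-order derivatives of f at y,
$\mathrm{hes}_y f$ is the determinant of this matrix (hessian of f at y) and $\mathrm{sign}\, d^2 f_y$ is the
signature of the symmetric matrix $d^2 f_y$ (i.e. the difference between the numbers of
positive and negative eigenvalues).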



When the equation (107) possesses more than one solution, then (106)
becomes the sum of terms corresponding to all solutions $a_0(x)$.

The Huyghens-Fresnel interference rule (107) gives rise to the classical
Huyghens superposition principle. In fact, when all $S_a(q^i)$ are solutions of the
same Hamilton-Jacobi equation (83), then ${}^0S(q^i)$ corresponding to the
envelope of

$\{P[\Psi_a \cdot \varphi(a)]\} = \{P[\mu(a) \cdot \sigma_a,\; t(a) + S_a]\}$

is a solution of this equation too (Carathéodory, 1956). The envelopewise
superposition principle for the classical Hamilton-Jacobi equation appears
to be the correspondence principle counterpart of the usual linear superposition
principle for the Schrödinger equation (Sławianowski, 1971;
Sławianowski, 1975). In particular one can solve initial (boundary) problems
for the Hamilton-Jacobi equation by the envelopewise superposition of classical
'propagators' (Sławianowski, 1971; Sławianowski, 1975) with initial
boundary conditions as 'coefficients' of 'superposition' (Sławianowski, 1975).

Figure 2 The classical Huyghens superposition rule. Continuous superposition of
wave functions $\Psi_a = \sigma_a \exp(iS_a/\hbar)$ with coefficients $\varphi(a) = \mu(a)\exp[it(a)/\hbar]$,
$\Psi = \sigma\exp(iS/\hbar) = \int \varphi(a)\Psi_a\, da$. When $\hbar \to 0$, the diagram of S becomes an envelope of
diagrams of $\{S_a + t(a)\}$ (i.e. the diagram of ${}^0S$).
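
The envelope rule can be illustrated with the textbook Huyghens construction in the plane. The sketch below is an illustration under assumptions that are not in the original text: circular-wave phases $S_a(x, y) = k\sqrt{(x - a)^2 + y^2}$ emitted from points (a, 0) on a line, a linear coefficient phase $t(a) = c\,a$ with |c| < k, and the function names. Each $S_a$ satisfies the eikonal (Hamilton-Jacobi) equation $|\nabla S| = k$; the code finds the stationary point in a (here a minimum, located by ternary search since the phase is convex in a) and checks that the envelope satisfies the same equation.

```python
import math

K = 2.0   # assumed wave number, k = sqrt(2E) for H = p^2 / 2
C = 1.0   # assumed slope of the coefficient phase t(a) = C * a, |C| < K


def phase(a, x, y):
    """S_a(x, y) + t(a) for a circular wave emitted from the point (a, 0)."""
    return K * math.hypot(x - a, y) + C * a


def envelope(x, y, half_width=50.0, iterations=200):
    """Stationary value of the phase over a, found by ternary search."""
    lo, hi = x - half_width, x + half_width
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phase(m1, x, y) < phase(m2, x, y):
            hi = m2
        else:
            lo = m1
    return phase(0.5 * (lo + hi), x, y)


def gradient_norm(x, y, h=1e-4):
    """Finite-difference estimate of |grad S| for the envelope phase."""
    sx = (envelope(x + h, y) - envelope(x - h, y)) / (2.0 * h)
    sy = (envelope(x, y + h) - envelope(x, y - h)) / (2.0 * h)
    return math.hypot(sx, sy)
```

With these choices the envelope is the tilted plane wave $c\,x + \sqrt{k^2 - c^2}\,|y|$, so `gradient_norm` should return a value close to `K` away from the line y = 0.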


In the previous section we have derived some results concerning the correspon- 
dence principle for mechanical systems with affine symmetry of degrees of 
freedom. The Weyl-Wigner-Moyal formulation enabled us to achieve this in 
an almost deductive way. Unfortunately, this method does not work for general 
mechanical systems. This is why we have paid special attention to structures 
which did not depend explicitly on the particular properties of the space of 
states. Now starting with these structures we present some general statements 
concerning the analogy and correspondence between classical and quantum 
theory, with special attention to the quasiclassical compatibility problem. The 
geometric guidance from the theory of mechanical systems in affine spaces 
suggests some general analogies between classical and quantum concepts 
without explicit calculations and asymptotic expansions. To justify these 
analogies strictly one should appeal to the correspondence principle formu- 
lated in terms of large quantum numbers. However we will not do this here 
because we believe that the general information and symmetry properties of 
statistical ensembles do not depend on the particular structure of the manifold 
of states. 

We start with the classical probability calculus of discrete sets. Let I be a
countable set of elementary events. No additional structure in I is supposed; it
is only a set.

Let C(I) denote the linear space of complex-valued functions over I.
Obviously, C(I) is an associative algebra under the local, pointwise product.
Statistical ensembles are described by probability distributions on I, i.e. real
and normalized functions ρ on I:

$\rho^*(i) = \rho(i), \quad \rho(i) \ge 0, \quad \sum_i \rho(i) = 1$

Physical quantities, i.e. random variables, are described by real functions on I,
A: $A^*(i) = A(i)$. Expectation values are given by the usual formula:

$\langle A \rangle_\rho = \sum_i A(i)\rho(i)$


Now let $\mathcal{E}_\rho$ be a linear subspace of C(I) composed of all random variables
which vanish on the support of ρ:

$\mathcal{E}_\rho = \{F \in C(I) : F\rho = 0\}$    (108)

Obviously $\mathcal{E}_\rho$ is an ideal in the associative algebra C(I); it is closed under
pointwise multiplications by all elements of C(I). All random variables in $\mathcal{E}_\rho$


are free from statistical dispersion on the ρ-ensemble; all measurements give a
sharply defined result, namely zero. In particular, let $F \in \mathcal{E}_\rho$ and F = (A − a),
where a is some constant. Then A has a sharply defined value $a \in A(I)$ on ρ
and the following eigenequation is satisfied:

$A\rho = a\rho$    (109)

The greater $\mathcal{E}_\rho$ is, the more information is carried by ρ: $\mathcal{E}_{\rho_1} \subset \mathcal{E}_{\rho_2}$ implies:

$\sum_i \rho_1(i) \ln \rho_1(i) \le \sum_i \rho_2(i) \ln \rho_2(i)$    (110)

Roughly speaking, by imposing additional eigenequations (109) upon ρ we
increase its informative content, because the number of measurements with
completely predictable results is then increased. Now, let us assume that $\mathcal{E}_\rho$ is a
maximal non-trivial ideal. Then ρ answers the maximal number of non-trivial
questions, i.e. the maximal number of measurements has unique, certain
results (the maximal number of eigenconditions (109) is satisfied). Such a ρ is a
pure state of the classical probability calculus. It is obvious that for an arbitrary
pure state ρ there exists such a point $i \in I$ that $\rho(j) = \delta_{ij}$, i.e. ρ is a point
measure and:

$\rho\rho = \rho$    (111)

Information (entropy) then takes its maximal (minimal), i.e. vanishing, value:

$\sum_i \rho(i) \ln \rho(i) = 0$

All physical quantities A then have sharply defined values A(i) and $\mathcal{E}_\rho$ consists
of functions vanishing at $i \in I$. Then $\mathcal{E}_\rho$ is maximal if and only if all random
variables are dispersion-free on ρ. This is just the main peculiarity of the
classical probability calculus.
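
The discrete notions above are simple enough to spell out on a finite set. The following minimal sketch is illustrative only: representing I by dictionary keys and the helper names are assumptions. It shows the ideal of functions vanishing on the support, the information functional $\sum \rho \ln \rho$, and the fact that a random variable is sharp on ρ exactly when it is constant on the support.

```python
import math


def support(rho):
    """Elementary events carrying non-zero probability."""
    return {i for i, w in rho.items() if w > 0}


def in_ideal(F, rho):
    """True if F * rho = 0 pointwise, i.e. F vanishes on the support of rho."""
    return all(F.get(i, 0) == 0 for i in support(rho))


def information(rho):
    """Sum of rho ln rho (non-positive; zero only for a point measure)."""
    return sum(w * math.log(w) for w in rho.values() if w > 0)


def is_sharp(A, rho):
    """True if the random variable A takes a single value on the support."""
    return len({A[i] for i in support(rho)}) == 1
```

For a point measure such as `{"a": 1.0, "b": 0.0, "c": 0.0}` every random variable is sharp and the information vanishes; for `{"a": 0.5, "b": 0.5, "c": 0.0}` only variables constant on `{"a", "b"}` are sharp.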

Now let us turn to quantum statistics. The state space of a quantum system is a
separable Hilbert space H. Let B(H) denote the associative but
non-commutative algebra of linear bounded operators in H. Physical quantities are
described by Hermitian elements of B(H): $A^+ = A$.

Statistical ensembles are described by density operators ρ, i.e. Hermitian,
positive and trace-class elements of B(H):

$\rho^+ = \rho, \quad \mathrm{Tr}\,\rho = 1, \quad \mathrm{Tr}(\rho A^+ A) \ge 0$ for any A

An expectation value of $A \in B(H)$ on ρ is given as:

$\langle A \rangle_\rho = \mathrm{Tr}(A\rho)$

Similarly as in the classical probability calculus, we define:

$E_\rho = \{F \in B(H) : F\rho = 0\}$    (115)




Obviously, $E_\rho$ is a left ideal in the associative algebra B(H); it is closed under
all transformations $F \to GF$. The algebraic and physical interpretation of $E_\rho$ is
analogous to some extent to that of $\mathcal{E}_\rho$ in classical statistics. In fact, $E_\rho$ describes
all measurements which, performed on the statistical ensemble ρ, give sharp
results without any statistical dispersion. Hermitian elements of $E_\rho$ describe
physical quantities which take a definite value, namely zero, when measured on
ρ. When $A \in B(H)$ is a physical quantity ($A^+ = A$) which takes a value $a \in \mathrm{Sp}\, A$
on the ensemble ρ without any statistical spread, i.e.:

$A\rho = \rho A = a\rho$

then obviously

$F = (A - aI) \in E_\rho$


The greater $E_\rho$ is, the more experimental questions are uniquely (spread-free)
answered by ρ. Similarly as in the classical case, $E_{\rho_1} \subset E_{\rho_2}$ implies the following
inequality for information (entropy):

$\mathrm{Tr}(\rho_1 \ln \rho_1) \le \mathrm{Tr}(\rho_2 \ln \rho_2)$

Pure states are defined as those answering a maximal number of experimental
questions, i.e. such that $E_\rho$ is a maximal (non-trivial) left ideal in B(H). It is
easy to show that for an arbitrary maximal left ideal $E_\rho$ there exists a
one-dimensional linear subspace $V \subset H$ such that $E_\rho$ consists of operators
vanishing on V:

$E_\rho = \{F \in B(H) : F|V = 0\}$

This implies ρ to be a projector mapping H onto V and

$\rho\rho = \rho$

Similarly as in classical statistics, information (entropy) then takes its maximal
(minimal) value

$\mathrm{Tr}(\rho \ln \rho) = 0$


All the above equations in B(H) are formally analogous to the classical ones in
C(I). However, the non-commutativity of B(H) involves us in serious physical
differences. In fact when ρ is a classical probability distribution describing a
pure state (point measure; $\mathcal{E}_\rho$ maximal) then obviously the quotient space
$C(I)/\mathcal{E}_\rho$ is isomorphic with the one-dimensional field of complex numbers C.
Therefore the pure ensembles of classical statistics are characterized by definite
values of all random variables; they satisfy eigenequations (109) for any
$A \in C(I)$. This is not the case in quantum theory. Even if $\rho \in B(H)$ is a pure
state ($E_\rho$ is maximal) there exist physical quantities which take no definite value
on ρ. A physical quantity A is spread-free on ρ if and only if the image V of ρ
(i.e. V = ρ(H)) is an eigenspace of A:

$A|V = a\, \mathrm{Id}_V$

There exists one more peculiarity of quantum statistics which has no counter- 
part in the classical probability calculus. It is also strictly related to some formal 


algebraic differences between C(I) and B(H). In fact, the only algebraic
structure carried naturally by C(I) is that of a commutative associative algebra
over the complex field C. On the contrary, the non-commutative associative
product in B(H) (superposition) gives rise to another non-trivial algebraic
structure, namely that of a Lie algebra under the quantum Poisson bracket:

$[A, B] = \frac{1}{i\hbar}(AB - BA)$    (120)

which is skew-symmetric and satisfies the Jacobi identity:

$[[A, B], C] + [[B, C], A] + [[C, A], B] = 0$    (120a)

As we have mentioned above, all informative, essentially statistical, properties of
statistical ensembles were described in terms of the associative algebra structure
(ideals, $E_\rho$, $\mathcal{E}_\rho$, eigenvalues, eigenequations and so forth). By contrast, the
structure of Lie algebra in B(H) is strictly related to the symmetry properties of
statistical states. In fact, any Hermitian element $A \in B(H)$ gives rise to two
physically distinct kinds of operations:

(1). One can measure A on a statistical state ρ and perform a statistical
analysis of the spread of the results. The corresponding mathematical
expression describing the statistical dispersion is given as:

$\sigma^2(A, \rho) = \mathrm{Tr}(A^2\rho) - (\mathrm{Tr}(A\rho))^2$    (121)

Spread-free eigenensembles satisfy the operator eigenequation:

$A\rho = \rho A = a\rho$    (122)

Hence, informative, statistical notions are related to the associative
algebra structure.

(2). A gives rise to the one-parameter group of unitary automorphisms of
the theory:

$B(H) \ni B \mapsto B_\tau = \exp\left[\frac{i\tau}{\hbar}A\right] B \exp\left[-\frac{i\tau}{\hbar}A\right] \in B(H)$    (123)

An infinitesimal description of this group is given in terms of the Lie
algebra structure:

$\frac{d}{d\tau} B_\tau = [B_\tau, A]$

$B_0 = B$, hence:

$\frac{d}{d\tau} B_\tau \Big|_{\tau = 0} = [B, A]$


Automorphisms (123) preserve all laboratory measurable quantities, i.e.
expectation values and probabilities of detection [$\mathrm{Tr}(A\rho)$, $\mathrm{Tr}(\rho_1\rho_2)$]. This is
because they preserve both the trace and the associative product, which gives rise
to the following compatibility conditions for the associative and Lie algebraic
structures:

$[A, BC] = [A, B]C + B[A, C]$


As we have mentioned in the previous chapters, from the laboratory point of
view information and symmetry properties are essentially independent of each
other. Nevertheless, in Hamiltonian quantum mechanics they become interrelated,
namely information implies symmetry. In fact the operator eigenequation
(122) implies that

$[A, \rho] = 0$

Therefore the lack of statistical dispersion of A on the ensemble ρ implies that
ρ is invariant under the one-parameter unitary group generated by A. This
implies that $\mathrm{Re}\, E_\rho$, that is the real linear subspace of $E_\rho$ composed of the
Hermitian operators, is a Lie algebra over the field of real numbers. This is true
for an arbitrary density operator $\rho \in B(H)$.
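
The statement 'information implies symmetry' can be checked on the smallest non-trivial example. The sketch below is illustrative only: the two-level system, the particular matrices and the helper names are assumptions, and the commutator is computed without the factor $1/i\hbar$ of (120), which does not affect whether it vanishes. It evaluates the dispersion (121) and the commutator with ρ for one observable that is sharp on ρ and one that is not.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def dispersion(A, rho):
    """sigma^2(A, rho) = Tr(A^2 rho) - (Tr(A rho))^2, as in (121)."""
    return trace(matmul(matmul(A, A), rho)) - trace(matmul(A, rho)) ** 2


def commutator(A, B):
    """AB - BA (the bracket (120) up to the factor 1/(i hbar))."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


rho = [[1, 0], [0, 0]]        # assumed pure state: projector onto the first basis vector
A_sharp = [[3, 0], [0, 5]]    # the image of rho is an eigenspace, eigenvalue 3
A_spread = [[0, 1], [1, 0]]   # does not leave the image of rho invariant
```

Here `dispersion(A_sharp, rho)` is zero and `commutator(A_sharp, rho)` vanishes, while `A_spread` has dispersion one and a non-zero commutator with ρ.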

To summarize the above considerations, we now compare the main algebraic
and physical features of quantum statistics and classical probability calculus:

(1). If ρ is a classical probability distribution then $\mathcal{E}_\rho$ is an associative ideal in
C(I) (cf. (108)). This structure gives an account of the informative
properties of ρ.

(2). If $\rho \in B(H)$ is a density operator then: (a) $E_\rho$ defined in (115) is a left
ideal in the associative algebra B(H). This structure describes the
informative properties of ρ. (b) $\mathrm{Re}\, E_\rho$ is a real Lie subalgebra in the Lie
algebra B(H) (under the quantum Poisson bracket). This structure gives
an account of the symmetries of ρ implied by its informative properties.

Now let us investigate the corresponding structures in classical Hamiltonian 
mechanics. In contrast to the previous sections we will not assume the affine 
geometry of degrees of freedom. 

Hence we must appeal to the general formulation of mechanics based on
symplectic geometry (Abraham, 1967; Arnold, 1971; 1974; Sternberg, 1964;
Hermann, 1970; Kostant, 1970; Souriau, 1970). Let us start with some
mathematical preliminaries:

The classical phase space of a mechanical system is a pair (P, γ) where P is an
analytic differential manifold (the set of classical states of a system) and γ is a
non-degenerate and closed differential two-form on P of class $C^\omega(P)$: if
$\langle \gamma, X \otimes Y \rangle = 0$ for an arbitrary vector field Y, then

$X = 0$    (126)

$d\gamma = 0$    (127)

Making use of local coordinates $\xi^a$ on $U \subset P$, we have:

$\gamma|U = \gamma_{ab}\, d\xi^a \wedge d\xi^b$



$\det[\gamma_{ab}] \ne 0$    (126a)

$\gamma_{ab,c} + \gamma_{bc,a} + \gamma_{ca,b} = 0$    (127a)

Hence dim P = 2n where n is an integer, the so-called number of degrees of
freedom.

The Darboux theorem (Abraham, 1967; Sternberg, 1964) implies the
existence of canonical coordinates $\xi^a$ such that:

$[\gamma_{ab}] = \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix}$    (128)

[compare with expression (4)]. One uses then the historical notation: $(\xi^a) =
(q^i, p_i)$ and:

$\gamma|U = dp_i \wedge dq^i$    (128a)

The contravariant skew-symmetric tensor reciprocal to γ will be denoted as $\tilde{\gamma}$:

$\gamma_{ab}\, \tilde{\gamma}^{bc} = \delta_a{}^c$    (129)

We will also write $\gamma^{ab}$ simply instead of $\tilde{\gamma}^{ab}$.

The Poisson bracket of the differentiable functions F, G is defined as:

$\{F, G\} = \langle dF \otimes dG, \tilde{\gamma} \rangle$    (130)

In the coordinates:

$\{F, G\} = \gamma^{ab}\, \frac{\partial F}{\partial \xi^a} \frac{\partial G}{\partial \xi^b}$

It is skew-symmetric and (127) implies that the Jacobi identity holds; hence the
Poisson bracket turns $C^\infty(P)$ (and $C^\omega(P)$) into a Lie algebra.
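
The Lie algebra property can be verified exactly for polynomial functions on a two-dimensional phase space. The following is a minimal sketch under assumptions not made in the text: one degree of freedom, canonical coordinates with the standard convention $\{q, p\} = 1$, integer coefficients, and a dictionary representation mapping exponents (i, j) of $q^i p^j$ to coefficients. All function names are illustrative.

```python
from collections import defaultdict


def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}


def add(f, g, sign=1):
    out = defaultdict(int)
    for k, c in f.items():
        out[k] += c
    for k, c in g.items():
        out[k] += sign * c
    return clean(out)


def mul(f, g):
    out = defaultdict(int)
    for (i, j), c in f.items():
        for (k, l), d in g.items():
            out[(i + k, j + l)] += c * d
    return clean(out)


def d_dq(f):
    return clean({(i - 1, j): i * c for (i, j), c in f.items() if i > 0})


def d_dp(f):
    return clean({(i, j - 1): j * c for (i, j), c in f.items() if j > 0})


def bracket(f, g):
    """{f, g} = df/dq * dg/dp - df/dp * dg/dq."""
    return add(mul(d_dq(f), d_dp(g)), mul(d_dp(f), d_dq(g)), sign=-1)


Q = {(1, 0): 1}                     # the coordinate q
P = {(0, 1): 1}                     # the momentum p
F = {(2, 1): 1}                     # q^2 p
G = {(1, 3): 1, (1, 0): 1}          # q p^3 + q
H = {(0, 2): 1, (3, 0): 1}          # p^2 + q^3
```

With these definitions `bracket(Q, P)` is the constant 1, and the cyclic sum of double brackets of F, G, H reduces to the empty dictionary, i.e. the zero polynomial. The same helpers also confirm the derivation property `bracket(F, mul(G, H)) == add(mul(bracket(F, G), H), mul(G, bracket(F, H)))`.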

Now let F be a differentiable function on P and dF its differential. Raising
the index of the Pfaff form dF by means of $\tilde{\gamma}$, one obtains a contravariant
vector field on P, the so-called Hamiltonian vector field generated by F. One
denotes it as $\overline{dF}$: $\langle \gamma, \overline{dF} \otimes Y \rangle = -\langle dF, Y \rangle$; in the coordinates:

$\overline{dF}^a = \gamma^{ab}\, \frac{\partial F}{\partial \xi^b}$

The components of $\overline{dF}$ with respect to the canonical coordinates $(q^i, p_i)$ are
given by:

$\left( \frac{\partial F}{\partial p_i},\; -\frac{\partial F}{\partial q^i} \right)$

The tensor γ gives rise to the skew-symmetric scalar product and
skew-symmetric orthogonality of vectors. We say that two vectors attached at the
same point $p \in P$ are γ-orthogonal if:

$\langle \gamma_p, u \otimes v \rangle = \gamma_{p\,ab}\, u^a v^b = 0$    (131)


Obviously, all vectors are self-orthogonal (skew symmetry of γ). Hence,
similarly as in the previous section, we can define I-class submanifolds and
lagrangian submanifolds in P:

The submanifold $M \subset P$ is said to be of I-class if any vector γ-orthogonal to
M is at the same time tangent to M.

$M \subset P$ is called isotropic if any vector tangent to M is at the same time
γ-orthogonal to M.

$M \subset P$ is said to be lagrangian when it is both I-class and isotropic.
Obviously, the dimension of lagrangian submanifolds equals the number of
degrees of freedom, $n = \tfrac{1}{2} \dim P$. Now, let:

$M = \{p \in P : F_i(p) = 0,\; i = 1 \ldots m\}$

M is an I-class submanifold if and only if the Poisson brackets of the $F_i$ vanish
on M:

$\{F_i, F_j\}\,|\,M = 0$

i.e.

$\{F_i, F_j\} = C^k{}_{ij} F_k$

for some functions $C^k{}_{ij}$ (Dirac, 1964).

Any I-class submanifold M of co-dimension m is foliated by a family
K(M) of m-dimensional isotropic submanifolds in such a way that all vectors
tangent to this foliation are γ-orthogonal to M (Tulczyjew, 1968;
Sławianowski, 1971; Dirac, 1950; Bergmann and Goldberg, 1955). K(M) is
called the characteristic or singular foliation of M. Some global properties of
the characteristic foliation are important in quasiclassical problems. The I-class
submanifold M is called simple when its quotient set P'(M) = M/K(M) carries
the natural 2(n − m)-dimensional differential structure of class $C^\omega$ projected
from M. When M is simple, then any smooth function on P'(M) gives rise to
some non-trivial function on M which becomes constant when restricted to any
fibre of K(M). For example, when M is a value surface of the Hamiltonian H
then the corresponding dynamical system is completely degenerate and admits
(2n − 1) non-trivial independent and autonomous constants of motion (the
Hamiltonian itself included among them). The quotient manifold P'(M) of a
simple submanifold M carries a natural phase space structure, because γ|M is
projectable from M to P'(M). The resulting phase space (P'(M), γ') is called the
reduced phase space of M; it describes gauge-free physical degrees of freedom
of M.

Now let us describe the opposite case: an I-class submanifold M is said to be
primitive when the only smooth functions on M constant on fibres of K(M) are
those constant all over the whole M. For example, the value surfaces of an
ergodic Hamiltonian are primitive submanifolds; the only non-trivial constant
of motion is the Hamiltonian (energy) itself.

Obviously an arbitrary lagrangian submanifold is both simple and primitive.

182 Uncertainty Principle and Foundations of Quantum Mechanics 

The skew-symmetric tensor $\gamma$ gives rise to the nowhere vanishing differential
form of maximal possible degree 2n, namely:

$$\gamma \wedge \ldots \wedge \gamma$$ (133)

It is well-known that such a form gives rise to some measure on P (Abraham,
1967; Schouten, 1954; Sławianowski, 1975). It is convenient to divide it by
$(2\pi\hbar)^n = h^n$ (cf. Section 2); the corresponding dimensionless measure on P will
be denoted as $\mu_\gamma$ or simply $\mu$. Obviously, when using canonical coordinates
$(q^1 \ldots p_n)$, the measure $\mu$ is given by

$$\int f \, d\mu = \left(\frac{1}{2\pi\hbar}\right)^n \int f \, dq^1 \ldots dq^n \, dp_1 \ldots dp_n$$ (134)

for an arbitrary smooth function f the support of which is contained in the
domain on which the coordinates $(q^1 \ldots p_n)$ are defined. The existence of a
canonical measure enables us to describe statistical ensembles in P by means of
non-negative scalar functions or distributions $\rho$ normalized to unity:

$$\int \rho A^* A \, d\mu \geq 0 \quad \text{for any } A$$
$$\int \rho \, d\mu = 1$$

Physical quantities are described by analytic functions on P. The linear space of
all analytic functions on P, $C^\omega(P)$, carries two natural algebraic structures: (a)
$C^\omega(P)$ is an associative algebra (obviously a commutative one) under the
pointwise product; (b) $C^\omega(P)$ is a Lie algebra under the Poisson bracket
operation.

In contrast to the quantum case, the associative and Lie algebraic structures
in $C^\omega(P)$ are algebraically independent, which gives rise to the separation of
information and symmetry properties of statistical ensembles. Nevertheless,
they are compatible in the sense that the Lie structure gives rise to the
derivations of the associative structure, i.e. the Leibniz rule is satisfied:

$$\{A, BC\} = \{A, B\}C + B\{A, C\}$$ (137)

which is an obvious counterpart of (125).

Hence, just as in quantum theory, an arbitrary physical quantity $A \in C^\omega(P)$,
$A^* = A$, can be related to two kinds of physical operations: measurements and
transformations.

Statistical analysis of A is based on the well-known formulas of classical
probabilistic calculus involving the associative algebraic structure on $C^\omega(P)$:

$$\langle A \rangle_\rho = \int A \rho \, d\mu$$ (138)

$$\sigma^2(A, \rho) = \int A^2 \rho \, d\mu - \left(\int A \rho \, d\mu\right)^2 = \langle A^2 \rangle_\rho - \langle A \rangle_\rho^2$$ (139)

Slawianowski 183 

Spread-free statistical ensembles of A satisfy the eigenequation

$$A\rho = a\rho$$ (140)
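
The mean value (138), the spread (139) and the spread-free condition (140) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the continuous distribution is replaced by equally weighted sample points, the phase space is the plane with coordinates (q, p), and the observable `oscillator_energy` and all other names are hypothetical choices made for this example, not part of the original text.

```python
import math

def expectation(points, weights, observable):
    """Weighted mean of observable(q, p): a discrete stand-in for (138)."""
    total = sum(weights)
    return sum(w * observable(q, p) for (q, p), w in zip(points, weights)) / total

def variance(points, weights, observable):
    """Mean of the square minus the squared mean, as in (139)."""
    mean = expectation(points, weights, observable)
    mean_sq = expectation(points, weights, lambda q, p: observable(q, p) ** 2)
    return mean_sq - mean ** 2

def oscillator_energy(q, p):
    """Hypothetical observable: H = (p^2 + q^2) / 2 in units with m = omega = 1."""
    return 0.5 * (p * p + q * q)

# Assumed ensemble: equally weighted points on the energy surface H = 1/2,
# i.e. on the unit circle in the (q, p) plane.
N = 360
points = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
          for k in range(N)]
weights = [1.0] * N

# The energy is constant on the support, so its spread vanishes (up to rounding),
# while the coordinate q is not sharp on the same ensemble.
energy_spread = variance(points, weights, oscillator_energy)
position_spread = variance(points, weights, lambda q, p: q)
print(energy_spread, position_spread)
```

The ensemble is spread-free for the energy because the energy takes a single value on the support, which is the content of (140); the same ensemble has a non-zero spread in q.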

A one-parameter group of automorphisms of $C^\omega(P)$ (canonical transformations)
generated by A consists of transformations $B \mapsto B_t$ such that

$$\frac{d}{dt} B_t = \{B_t, A\}, \qquad B_0 = B$$

$$\left.\frac{d}{dt} B_t\right|_{t=0} = \{B, A\}$$

Such transformations preserve both the associative product (due to expression
(137)) and the Poisson bracket (due to the Jacobi identity). Hence, they preserve
all measurable quantities of the theory.

Now we are able to investigate classical counterparts of quantum ideals $\mathcal{E}_\rho$
and pure states in some detail. Let us start with some definitions.

An ideal $V \subset C^\omega(P)$ in the associative algebra $C^\omega(P)$ is said to be
probabilistic if there exists a subset $M \subset P$ such that

$$V = \{F \in C^\omega(P) : F|_M = 0\}$$ (142)

Such an ideal is denoted as V(M). Let $\rho$ be an arbitrary probability distribution
and let us put

$$C^\omega(P) \supset \mathcal{E}_\rho = \{F \in C^\omega(P) : F\rho = 0\}$$ (143)

Then

$$\mathcal{E}_\rho = V(\operatorname{Supp} \rho)$$

which justifies the name we have used above. Such an ideal describes uniquely
the space of all physical quantities which are dispersion-free on a given
statistical ensemble.

Any ideal in the associative algebra $C^\omega(P)$ is contained in some probabilistic
ideal. As an example of an ideal which is not itself probabilistic, let us mention
the ideal of all functions which vanish on a given subset, together with their
derivatives up to some fixed order.

An associative ideal $V \subset C^\omega(P)$ is called self-consistent when it is at the same
time a Lie subalgebra of $C^\omega(P)$. A probabilistic ideal $\mathcal{E}_\rho$ is self-consistent if and
only if Supp $\rho$ is a first-class submanifold of P. When physical quantities F,
$G \in C^\omega(P)$ are simultaneously spread-free on $\rho$ and $\mathcal{E}_\rho$ is self-consistent, then

$$\{F, G\}|_{\operatorname{Supp} \rho} = 0, \quad \text{i.e. } \{F, G\} \in \mathcal{E}_\rho$$ (145)

These are just the classical compatibility conditions. In particular, when both
$(q^i, p_i)$ are dispersion-free on $\rho$, then $\mathcal{E}_\rho$ fails to be self-consistent. Hence, the
geometric a priori of symplectic manifolds seems to anticipate, on the purely
classical level, the Heisenberg uncertainty principle.
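
The last claim can be verified in one line. The sketch below is an illustration added here, not part of the original argument; it writes $\mathcal{E}_\rho$ for the ideal of functions F with $F\rho = 0$, takes a and b to be the sharp values of $q^1$ and $p_1$ on the ensemble, and assumes the convention $\{q^i, p_j\} = \delta^i_j$.

```latex
% Assumed convention: \{q^i, p_j\} = \delta^i_j (the sign plays no role here).
\begin{align*}
F &= q^1 - a, \qquad G = p_1 - b, \qquad F\rho = 0, \quad G\rho = 0
  \ \Rightarrow\ F, G \in \mathcal{E}_\rho ,\\
\{F, G\} &= \{q^1, p_1\} = 1 , \qquad 1 \cdot \rho \neq 0
  \ \Rightarrow\ \{F, G\} \notin \mathcal{E}_\rho .
\end{align*}
```

The bracket of two elements of the ideal leaves the ideal, so the ideal is not a Lie subalgebra and the ensemble with sharp $q^1$ and $p_1$ violates the compatibility condition.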

In the classical probabilistic calculus of discrete sets we had no compatibility
restrictions; all probability distributions and all ideals $\mathcal{E}_\rho$ were admissible and
consistent. This is not the case in Hamiltonian mechanics. The
correspondence-principle analysis based on information and symmetry suggests
that the only physically justified probability distributions $\rho$ on P are those
for which the ideals $\mathcal{E}_\rho$ are self-consistent. Hence, a classical probability
distribution the support of which fails to be an I-class submanifold is only a
technically convenient shorthand for probability distributions closely concentrated
around the 'support' but essentially smeared out beyond it. Hence, the lowest possible
dimension of the support of a probability distribution satisfying the classical
compatibility and uncertainty restrictions equals the number of degrees of
freedom (the lowest dimension of an I-class submanifold). In particular, point
measures are incompatible.

As we have mentioned in the previous section, classical counterparts of
pure states are probability distributions $\rho$ for which $\mathcal{E}_\rho$ is a maximal
self-consistent probabilistic ideal. In particular, any probability distribution the
support of which is a connected closed lagrangian submanifold is a
quasiclassical pure state. The probability distribution $\rho[D, S]$ (76) is the most
typical example because of its obvious relationship with the wave functions $\Psi$
through their phases S and their moduli D.

However, it is interesting that Supp $\rho$ need not be a lagrangian submanifold
in order to ensure the maximality of the probabilistic ideal $\mathcal{E}_\rho$. In fact, when $\rho$
is an arbitrary distribution the support of which is a primitive submanifold (cf.
the definition above), then $\mathcal{E}_\rho$ is maximal provided Supp $\rho$ is analytic and
connected. Although such a distribution is pure in the sense of answering the
maximal number of compatible questions (measurements), it is hard to relate it
to any wave function. The problem of the quasiclassical interpretation of such
distributions from the point of view of the correspondence principle is still
open. As a typical example we refer to the microcanonical ensembles of ergodic
Hamiltonians. Hence one can hope that such distributions are in some sense
related to the quantum ergodic theory (Ludwig, 1961).

Remark: Let M be an arbitrary submanifold of P, not necessarily a
self-consistent one. There exist self-consistent ideals in $C^\omega(P)$ (associative ideals
being at the same time Lie algebras) all elements of which vanish on M.
However, they are of a non-probabilistic type; any such ideal is a non-trivial
proper subspace of V(M). A typical example is

V(M) = $\{f \in C^\omega(P) : f(p) = 0,\ df_p = 0,\ p \in M\}$


When $\rho$ is a classical probability distribution and Supp $\rho$ is a primitive analytic
submanifold, then obviously $\mathcal{E}_\rho = V(\operatorname{Supp} \rho)$ is a maximal self-consistent
probabilistic ideal and $\rho$ is a conceptual counterpart of a pure quantum state.
However, if we omitted the word 'probabilistic' in the above statement it
would become false, because there exist essentially larger self-consistent ideals
of a non-probabilistic type. In fact, let $p \in P$ and let $U_p \subset T_pP$ be some
n-dimensional isotropic subspace of the tangent space at p, i.e. an n-dimensional
linear space of pairwise $\gamma$-orthogonal vectors attached at $p \in P$:

$$\langle \gamma_p, w \otimes v \rangle = \gamma_{p\,ab} w^a v^b = 0$$

provided $w, v \in U_p$. Now let us put

$$S(U_p) = \{f \in C^\omega(P) : f(p) = 0,\ df_p \in U_p\}$$

One can easily show that $S(U_p)$ is a self-consistent ideal in $C^\omega(P)$; moreover, it is
a maximal self-consistent ideal. Let $M \subset P$ be an arbitrary lagrangian
submanifold and let $p \in M$. Then obviously

$$V(M) \subset S(T_pM)$$

and this is a non-trivial proper inclusion.

Hence the point measures in P are related to maximal self-consistent ideals
of a non-probabilistic type in $C^\omega(P)$. Let $\delta_p$ be a Dirac measure concentrated at
$p \in M$. There exists a system of 2n generators of $S(U_p)$, namely $F_1 \ldots F_{2n}$,
such that:

$$\{F_i, \delta_p\} = 0$$

Nevertheless, $\mathcal{E}(\delta_p)$ is essentially larger than $S(U_p)$ and this is why the point
measures violate the relationship of information and symmetry suggested by
the correspondence principle, although the 2n invariance conditions
$0 = \{F_i, \delta_p\}$ are satisfied.

Let us finish our chapter with some remarks concerning the quasiclassical
description of projectors in terms of symplectic geometry. We present only
general ideas; more detailed information is given in Sławianowski (1971) and
Sławianowski (1975).

As we have mentioned above, there exists an exact relationship between
lagrangian submanifolds in P and phases of quasiclassical wave functions. It is a
well-known peculiarity of the quasiclassical WKB approximation that all
relations between phases become separated so as to satisfy some autonomous
closed algebra quite independent of what happens to the moduli of the wave
functions. Let D(P) denote the set of all closed lagrangian submanifolds in P.
Now let $M \subset P$ be a simple closed submanifold of P (cf. the definition above)
and let $D(M) \subset D(P)$ denote the set of all closed lagrangian submanifolds of P
contained in M (i.e. $\mathcal{M} \in D(M)$ if and only if $\mathcal{M} \subset M$). D(M) is non-empty
because M is an I-class submanifold (compatibility conditions satisfied by
quasiclassical wave functions).

One can show that, except for some special cases of singular intersections,
there exists for an arbitrary $\mathcal{M} \in D(P)$ only one $(\Lambda_M \mathcal{M}) \in D(M)$ such that



The natural mapping $\Lambda_M : D(P) \to D(M)$ satisfies the following rules:
(1). It is a retraction onto D(M):

$$\Lambda_M|_{D(M)} = \mathrm{id}_{D(M)}$$ (152)

in particular, it is idempotent:

$$\Lambda_M \circ \Lambda_M = \Lambda_M$$ (153)

(2). When M, N, $M \cap N$ are simple and closed, then:

$$\Lambda_M \circ \Lambda_N = \Lambda_N \circ \Lambda_M = \Lambda_{M \cap N}$$ (154)

(3). When

$$\Lambda_M \circ \Lambda_N = \Lambda_N \circ \Lambda_M$$ (155)

then $M \cap N$ is a simple submanifold and (154) is satisfied.
(4). When $f : P \to P$ is an arbitrary canonical mapping, then:

$$\Lambda_{f(M)} = F \circ \Lambda_M \circ F^{-1}$$ (156)

where $F : D(P) \to D(P)$ is the mapping in D(P) induced in an obvious way by f.
(By canonical mapping we mean here an analytic diffeomorphism of P
preserving $\gamma$.)

$\Lambda_M$ describes the quasiclassical projection onto a subspace of quasiclassical
wave functions characterized by definite sharp values of those physical
quantities which are described by smooth functions constant on M.

Example: Let us consider an affine phase space with canonical affine
coordinates $(q^i, p_i)$; obviously $\gamma = dp_i \wedge dq^i$. Let us put:

$$M = \{p \in P : p_1(p) = b\}$$
$$D(P) \ni \mathcal{M} = \{p \in P : q^i(p) = a^i,\ i = 1 \ldots n\}$$

Then

$$\Lambda_M \mathcal{M} = \{p \in P : p_1(p) = b,\ q^2(p) = a^2, \ldots, q^n(p) = a^n\}$$

Hence fixation of the value of $p_1$ by means of $\Lambda_M$ results in a complete
indeterminacy of $q^1$ (quasiclassical uncertainty principle).
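
The result of the example can be checked directly. The following sketch is an added illustration; it assumes only that the symplectic form $\gamma = dp_i \wedge dq^i$ pairs the coordinate vector $\partial/\partial q^i$ non-trivially with $\partial/\partial p_i$ alone, and it writes $\Lambda_M \mathcal{M}$ for the projected submanifold $\{p_1 = b,\ q^2 = a^2, \ldots, q^n = a^n\}$.

```latex
% Check of the example, assuming \gamma = dp_i \wedge dq^i pairs
% \partial/\partial q^i only with \partial/\partial p_i.
\begin{align*}
T(\Lambda_M \mathcal{M}) &= \operatorname{span}\Big\{
  \frac{\partial}{\partial q^1},\ \frac{\partial}{\partial p_2},\ \ldots,\
  \frac{\partial}{\partial p_n}\Big\}, \qquad \dim = n,\\
\gamma\Big(\frac{\partial}{\partial q^1}, \frac{\partial}{\partial p_k}\Big) &= 0
  \quad (k = 2 \ldots n), \qquad
\gamma\Big(\frac{\partial}{\partial p_k}, \frac{\partial}{\partial p_l}\Big) = 0 .
\end{align*}
```

The projected submanifold is therefore n-dimensional and isotropic, hence lagrangian, and it lies in M because $p_1 = b$ on it. The coordinate $q^1$ is left completely unrestricted, while the sharp value $q^1 = a^1$ of the original submanifold has been lost.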

Quasiclassical theory becomes complete when, besides the projectors $\Lambda_M$,
classical Huyghens-Fresnel superpositions are taken into account (cf. the
definition at the end of the previous section). The actual definition and
properties of $\Lambda_M$ are based on the geometric a priori of a symplectic manifold
endowed with the second-order Pfaff form $\gamma$ with a local description $dp_i \wedge dq^i$.
Similarly, the geometric structure of the Huyghens-Fresnel superposition can
be deduced from the geometric a priori of the so-called contact manifolds, the
geometry of which is based on the first-order Pfaff form $\Omega$ with a local
representation $-dz + p_i\,dq^i$ (Sławianowski, 1971). The contact manifold is a
fibre bundle over the symplectic manifold with a one-dimensional fibre. Roughly
speaking, it arises from the classical phase space when the action variable (i.e.
the phase of a quasiclassical wave function) is taken into account as an
additional dimension (Souriau, 1970; Arnold, 1974).


Abraham, R. (1967) Foundations of Mechanics, Benjamin, New York.
Arnold, V. I. (1971) Obyknovennyie Differentsialnyie Uravneniya (Ordinary Differential Equations, in Russian), Nauka, Moscow.
Arnold, V. I. (1974) Matematicheskie Metody Klassicheskoj Mekhaniki (Mathematical Methods of Classical Mechanics, in Russian), Nauka, Moscow.
Arnold, V. I. and Avez, A. (1968) Ergodic Problems of Classical Mechanics, Benjamin, New York.
Bargmann, V. (1954) 'On unitary ray representations of continuous groups', Ann. Math., 59, 1-46.
Bergmann, P. G. (1966) 'Hamilton-Jacobi and Schrödinger theory in theories with first-class Hamiltonian constraints', Phys. Rev., 144, 1078-1080.
Bergmann, P. G. (1970) Quantisation of the Gravitational Field, Aerospace Research Laboratories Report ARL 70-0066.
Bergmann, P. G. and Goldberg, I. (1955) 'Dirac bracket transformations in phase space', Phys. Rev., 98, 531-538.
Born, M. (1925) Vorlesungen über Atommechanik, Springer, Berlin.
Born, M., Heisenberg, W. and Jordan, P. (1926) 'Zur Quantenmechanik. II', Z. Phys., 35, 557-615.
Born, M. and Jordan, P. (1925) 'Zur Quantenmechanik', Z. Phys., 34, 858-888.
Born, M. and Wolf, E. (1964) Principles of Optics, Pergamon Press, London.
Carathéodory, C. (1956) Variationsrechnung und partielle Differentialgleichungen erster Ordnung, B. G. Teubner, Leipzig.
Dirac, P. A. M. (1950) 'Generalized Hamiltonian dynamics', Canad. J. Math., 2, 129.
Dirac, P. A. M. (1951) 'The Hamiltonian form of field dynamics', Canad. J. Math., 3, 1.
Dirac, P. A. M. (1958a) 'Generalized Hamiltonian dynamics', Proc. Roy. Soc. London, A 246,
Dirac, P. A. M. (1958b) 'The theory of gravitation in Hamiltonian form', Proc. Roy. Soc. London, A 246, 333.
Dirac, P. A. M. (1964) 'Hamiltonian methods and quantum mechanics', Proc. Roy. Irish Acad. Sect. A, 63, 49-59.
Erdélyi, A. (1956) Asymptotic Expansions, Dover, New York.
Fröman, N. and Fröman, P. O. (1965) JWKB Approximation, North-Holland Publishing Co.,
Heisenberg, W. (1925) 'Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen', Z. Phys., 33, 879-893.
Heisenberg, W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Phys., 43, 172-198.
Hermann, R. (1970) Vector Bundles in Mathematical Physics, Benjamin, New York.
Kostant, B. (1970) Lecture Notes in Mathematics, Springer, New York.
Landau, L. D. and Lifschitz, E. M. (1958) Quantum Mechanics, Pergamon Press, London.
Ludwig, G. (1961) 'Axiomatic quantum statistics of macroscopic systems (ergodic theory)', in Ergodic Theories, Proceedings of the International School of Physics 'Enrico Fermi', XIV Course, P. Caldirola, Ed., Academic Press, New York.
Mackey, G. W. (1963) The Mathematical Foundations of Quantum Mechanics, Benjamin, New
Messiah, A. (1965) Quantum Mechanics, North-Holland Publishing Co., Amsterdam.
Moyal, J. E. (1949) 'Quantum mechanics as a statistical theory', Proc. Cambridge Phil. Soc., 45, 99.
Schouten, J. A. (1954) Ricci Calculus, Berlin.
Schouten, J. A. and Kulk, W. (1949) Pfaff's Problem and its Generalizations, Clarendon Press,
Schrödinger, E. (1926a) 'Quantisierung als Eigenwertproblem, I', Annln. Phys., 79, 361-376.
Schrödinger, E. (1926b) 'Quantisierung als Eigenwertproblem, II', Annln. Phys., 79, 489-527.
Schrödinger, E. (1926c) 'Über das Verhältnis der Heisenberg-Born-Jordanschen Quantenmechanik zu der meinen', Annln. Phys., 79, 734-756.
Schrödinger, E. (1926d) 'Quantisierung als Eigenwertproblem, III', Annln. Phys., 80, 437-490.
Schrödinger, E. (1926e) 'Quantisierung als Eigenwertproblem, IV', Annln. Phys., 81, 109-139.
Schwartz, L. (1950-1951) Théorie des distributions, Hermann, Paris.
Sławianowski, J. J. (1971) 'Quantum relations remaining valid on the classical level', Rep. Math.
Sławianowski, J. J. (1972) 'Geometry of Van Vleck Ensembles', Rep. Math. Phys., 3, 157-172.
Sławianowski, J. J. (1973) 'Classical pure states. Information and symmetry in statistical mechanics', Int. J. Theoret. Phys., 8, 451-462.
Sławianowski, J. J. (1974) 'Abelian groups and the Weyl approach to kinematics', Rep. Math.
Sławianowski, J. J. (1975) Geometria Przestrzeni Fazowych (Geometry of Phase Spaces, in Polish), Polish Scientific Publishers, Warsaw.
Souriau, J. M. (1970) Structure des Systèmes Dynamiques, Dunod, Paris.
Sternberg, S. (1964) Lectures on Differential Geometry, Prentice Hall, New York.
Synge, J. L. (1953) 'Primitive quantization in the relativistic two-body problem', Phys. Rev., 89,
Synge, J. L. (1954) Geometrical Mechanics and de Broglie Waves, Cambridge University Press,
Synge, J. L. (1960) Classical Dynamics, Springer, Berlin.
Sniatycki, J. and Tulczyjew, W. M. (1971) 'Canonical dynamics of relativistic charged particles', Ann. Inst. Henri Poincaré, XV, 177-187.
Tulczyjew, W. M. (1968) Unpublished results.
Van Vleck, J. H. (1928) 'The correspondence principle in the statistical interpretation of quantum mechanics', Proc. Nat. Acad. Sci., 14, 178-188.
Weinstein, A. (1971) 'Symplectic manifolds and their lagrangian submanifolds', Advan. Math., 6,
Weinstein, A. (1973) 'Lagrangian submanifolds and Hamiltonian systems', Ann. Math., 98,
Weyl, H. (1928) 'Quantenmechanik und Gruppentheorie', Z. Phys., 46, 1.
Weyl, H. (1931) The Theory of Groups and Quantum Mechanics, Dover, New York.
Wigner, E. (1932) 'On the quantum correction for thermodynamic equilibrium', Phys. Rev., 40,




A Theoretical Description of Single Microsystems 


Universitaet Marburg, Germany 

The fundamental relation for the interpretation of quantum mechanics is:

$$m = \operatorname{tr}(WE)$$ (1)

E being a projection operator in a Hilbert space $\mathcal{H}$ and W being a self-adjoint
operator with $W \geq 0$ and $\operatorname{tr}(W) = 1$. As is well known, the trace in (1) can be
calculated by

$$\operatorname{tr}(WE) = \sum_\nu (\phi_\nu, WE\phi_\nu)$$

$\phi_\nu$ being any complete orthonormal set of vectors in $\mathcal{H}$; $(\ldots, \ldots)$ denotes the
inner product in $\mathcal{H}$. The real number m in (1) satisfies $0 \leq m \leq 1$. The
fundamental statistical interpretation of quantum mechanics is as follows: m is the
probability of E.

Better known than the general form (1) is the special case $W = P_\phi$, $P_\phi$ being
the projection operator which projects onto the one-dimensional subspace of
$\mathcal{H}$ spanned by the vector $\phi$ ($\|\phi\| = 1$). Then (1) takes on the form

$$m = \operatorname{tr}(P_\phi E) = (\phi, E\phi)$$ (2)

Any experimental test of quantum mechanics employs the relation (1). 
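
Relations (1) and (2) can be tried out on the smallest possible case. The following is a minimal sketch under stated assumptions, not a prescription from the text: the Hilbert space is taken to be two-dimensional and real, operators are plain nested lists, and the particular W, E and phi are hypothetical values chosen only for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements, i.e. the trace in the standard basis."""
    return sum(a[i][i] for i in range(len(a)))

def projector(phi):
    """Projection operator onto the line spanned by a real unit vector phi."""
    return [[x * y for y in phi] for x in phi]

# Assumed statistical operator: W >= 0 and tr(W) = 1.
W = [[0.75, 0.0], [0.0, 0.25]]
# Assumed yes-no observable: projection onto the vector (1, 1) / sqrt(2).
E = projector([2 ** -0.5, 2 ** -0.5])

m = trace(matmul(W, E))                      # relation (1)

# Special case W = P_phi: tr(P_phi E) should equal the inner product (phi, E phi).
phi = [1.0, 0.0]
m_pure = trace(matmul(projector(phi), E))    # left-hand side of (2)
inner = sum(phi[i] * sum(E[i][j] * phi[j] for j in range(2)) for i in range(2))
print(m, m_pure, inner)
```

Both numbers lie between 0 and 1, as required of a probability, and the two sides of (2) agree up to rounding.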

A description of quantum mechanics based on the general formula (1) is
given in Ludwig (1975) and, in a more consistent way, in Ludwig (in
preparation a).

More general, but reducible to (1), is the well-known interpretation that

$$\operatorname{tr}(WA)$$ (3)

(and $\operatorname{tr}(P_\phi A) = (\phi, A\phi)$ in the case of $W = P_\phi$) is the expectation value of the
observable A (in expression (3) A is a self-adjoint operator).

The Heisenberg uncertainty relation is nothing other than the physical
interpretation of the mathematical theorem that Heisenberg's commutation
relation $PQ - QP = (\hbar/i)\mathbf{1}$, holding for the position and momentum operators
Q and P, yields

$$\Delta P \cdot \Delta Q \geq \frac{\hbar}{2}$$ (4)


with $\Delta P$ and $\Delta Q$ defined by

$$\Delta P^2 = \operatorname{tr}(W(P - \alpha\mathbf{1})^2)$$
$$\Delta Q^2 = \operatorname{tr}(W(Q - \beta\mathbf{1})^2)$$
$$\alpha = \operatorname{tr}(WP), \qquad \beta = \operatorname{tr}(WQ)$$

The physical interpretation of expression (4) depends on the physical
interpretation of (3).

In order to reduce expression (3) to (1), it is necessary to introduce the
conception of 'simultaneous measurability', i.e. of 'commensurability'. Usually
it is assumed that commensurable $E_\lambda$'s ($E_\lambda$ being a projection operator in $\mathcal{H}$)
commute, i.e. $E_\lambda \cdot E_\mu = E_\mu \cdot E_\lambda$. By means of a family of commuting projection
operators and a measuring scale it is possible to deduce (3) from (1). We will
give such a deduction in Section 5.

We may summarize: for the interpretation of quantum mechanics, we need
the following concepts (the physical meaning of which must be exactly
specified):
(1). Probability, denoted in expression (1) by the real number m.
(2). A physical interpretation of the operator W in (1) (or of the vector $\phi$ in
(2)). The following expressions are often used: W is the mathematical image of
an 'ensemble' or of a 'state'. Sometimes the expression 'state' is used only in the
case $W = P_\phi$, calling $\phi$ the 'state of the system'. W is also called the 'statistical
operator' (or the 'statistical matrix', if given in matrix form).
(3). E in (1) is considered as the image of a 'yes-no observable', a 'yes-no
measurement'. Words like 'questions' or 'propositions' are also used.
(4). A conception of 'commensurability'. Sometimes, instead of
'commensurability', one speaks of 'measurability at the same time'.

All discussions concerning quantum mechanics ultimately depend on the
concepts mentioned above under points (1) through (4). Many misunderstandings
arise because various authors use the same words for different
conceptions. All mistakes and all paradoxes are based on inadmissible fictions
attached to the expressions noted under (1) through (4). It is impossible to give
here a survey of all the discussions concerned with the various concepts.

Such discussions do not appear clear enough since, up to now, one has tried
to clarify the meaning of the concepts under (1) through (4) using common
language. This was necessary, because the description of single experiments
and their results had to be given in common language. The gap between
relation (1) and experiments is much too big to correlate experiments and
theory directly. One has to use common language to bridge this gap.

This necessity may be seen if we want to describe an experiment on one 
individual microsystem, e.g. an individual trace in a cloud chamber. There is no 
term in the mathematical framework of quantum theory which could be used as 
an image of this individual experiment; neither the 'probability' m, nor the 
'statistical operator' W, nor the 'observable' A. 


Ludwig 191 

Thus it seems only natural to use common language in describing single
experiments.

In the mathematical framework of quantum mechanics, we have the 'set of 
statistical operators', the 'set of projection operators', but there is no set the 
elements of which could be used as 'images' of individual microsystems. The 
concept of an 'individual microsystem' itself is not cleared up theoretically by 
quantum mechanics; it may be understood only intuitively. 

Someone might object that quantum theory contains terms usable as images
for individual systems in the same way as classical mechanics does: in classical
mechanics every individual system is described by a point in the $\Gamma$-space, the
'state space' of classical mechanics; in quantum theory every individual system
is described by a 'state' $\phi$, $\phi$ being an element of the Hilbert space $\mathcal{H}$. The
surface of the unit sphere in $\mathcal{H}$ has to be taken as the 'state space' of quantum
mechanics. In my opinion, this is a false interpretation of quantum mechanics;
in any case, in quantum theory, it is generally impossible to determine the state
$\phi$ of an individual system in a single experiment.

All the problems sketched above motivate the development of a more
comprehensive mathematical framework for quantum theory, which opens
the possibility of interpreting this framework before using relation (1) and the
concepts given above under points (2) through (4). The experiments on
individual microsystems and the statistics of these experiments should be
described directly by their respective 'images' in this new mathematical
framework. Then the physical interpretation of quantum mechanics does not
depend on the 'statistical operators' W and the 'observable operator' A; the
physical interpretation will depend only on two fundamental notions, just as in
all classical theories, namely on the notions of 'physical system' and of
'statistics'.

The fundamental concepts of quantum mechanics, in this new form, will not
be essentially different from those of any classical theory. The interpretation
may be done in exactly the same way as in classical theories. The concepts
mentioned above under points (2) through (4) will not be fundamental, but will
be derived from the fundamental concepts of physical system and of statistics.
Their interpretation will be given automatically by their deduction and by the
interpretation of the two fundamental concepts. If only these two concepts are
accepted as fundamental, all paradoxes and misunderstandings disappear; only
mistakes will be possible, but mistakes can be corrected.


The conception of physical system is based on the possibility of making
'experiments' on such systems. The first step in such experiments is to
'produce', to 'manufacture', to 'prepare' the systems. Examples of microsystems:
an accelerator produces 'ions'; another accelerator produces 'electrons', a third
'pairs of particles', using colliding beams. The last example shows that an
individual system is not necessarily an elementary system (in the sense of not
being composed). An individual system is only one of a large family produced
by a preparing apparatus.

After their production, the various systems can be 'recorded' experimen- 
tally. This recording may also be described by saying that 'something has been 
measured on the system'. Such recording of a system may be realized, e.g. by a 
trace in a cloud chamber, by a signal of a counter, etc. 

We will give examples of the preparing and recording of microsystems: atoms
can be produced (prepared) by a canal-ray tube and the photons emitted by
these atoms may be recorded; electron-proton pairs can be prepared, and after
the collision between electrons and protons the electrons may be recorded;
nuclei can be prepared and (in the case of $\beta$-decay) the electrons emitted by the
nuclei can be recorded.

To show that the usual experimental procedures concerning classical systems 
have the same structure, we will give just one example: a gun may 'prepare' 
projectiles and the trajectory of the projectiles can be recorded; e.g. the point 
of impact as a part of the trajectory may be recorded. 

The examples show that the fundamental parts of 'experiments on physical 
systems' are the processes of preparing and recording. 

It is impossible to give here a more detailed description of the preparing and
recording procedures composing the experiments on systems. However, we
want to stress that preparing and recording procedures may be described
without referring to the physical systems that are prepared and recorded. A
gun, for instance, may be described as a procedure (i.e. its construction and
instructions for use) without referring to a special projectile fired by this gun;
also the impact may be described without explaining the cause of this event.
Likewise, in the case of experiments on microsystems, the experimenter can
describe the apparatus used to prepare the systems and the apparatus recording
them. The 'evaluation' of such experiments, giving values of quantities defined
by a theory of the system, follows after this description; cross sections,
wavelengths, etc., are calculated only after the description of the experimental
procedures and the recording of the response of the measuring apparatus have
been given (e.g. by a computer).

To give a mathematical picture of such experiments on individual systems, it
is necessary to introduce a set M, the elements of which shall be 'images' of the
systems. Given a special atom 'x' in an experiment, the relation $x \in M$ should
be the mathematical form of the proposition: x is a physical system. However,
the relation $x \in M$ reflects the proposition 'x is a physical system' only if the set
M is endowed with a structure as an image of preparing and recording
procedures. (A systematic description of this method of theoretical physics,
employed here in a more intuitive way, is given in Ludwig (in preparation b).)

We will not discuss the question concerning the reasons which allow us to
speak of given microsystems and, in this sense, of real microsystems of which
the elements of M are images. The reader interested in such questions will find
a detailed discussion in Ludwig (1970, 1972a) and in a very short form in
Ludwig (1974a). In these references, a theoretical description is given of how to
'recover' the microsystems, starting from the description of preparing and
recording procedures only. However, to understand quantum mechanics it is
not necessary to justify the existence of real microsystems, an existence which
historically was founded more or less on intuition.


According to the short sketches given above, the preparing and recording
devices have a common structure, namely, that by these procedures physical
systems can be selected. Therefore, it seems to be useful to begin with a
mathematical description of this common structure.

Physical as well as mathematical reasons motivate the introduction of a more
general species of structure henceforth called a 'selection procedure':

A subset $\mathcal{S} \subset \mathcal{P}(M)$ ($\mathcal{P}(M)$ the power set, i.e. the set of all subsets of M) is
called a structure of selection procedures or, briefly, a selection structure on M if
the following axioms hold ($b \setminus a$ being the relative complement of a in b):

AS 1.1   $a, b \in \mathcal{S}$ and $a \subset b \Rightarrow b \setminus a \in \mathcal{S}$

AS 1.2   $a, b \in \mathcal{S} \Rightarrow a \cap b \in \mathcal{S}$

A physicist would like to 'understand' why we have postulated AS 1.1, 2.
However, it is more or less difficult to 'make plausible' this rather general
conception of selection procedures. Therefore we can only give some hints:

If $M \in \mathcal{S}$ held, AS 1.1, 2 would have the consequence that $\mathcal{S}$ is a Boolean algebra
of sets. We have not postulated $M \in \mathcal{S}$; nevertheless, the assumption $M \in \mathcal{S}$
would not lead to mathematical contradictions in the following. However, it
seems to us that the postulate $M \in \mathcal{S}$ would be unrealistic on physical grounds.
To see this, we will try to elucidate the physical significance of AS 1.1, 2.

The physical interpretation of '$x \in a$ and $a \in \mathcal{S}$' is as follows: the physical
system x has been selected by the selection procedure a. In this sense, an
element $a \in \mathcal{S}$ represents the method of selecting as well as the family of
physical systems selected by this method.

If there are two selection procedures a and b given by special physical
methods, it is not difficult to construct the following selection procedure: select
all $x \in M$ which are selected by a as well as by b, i.e. the set $a \cap b$. This is the
meaning of AS 1.2. Of course, $a \cap b = \emptyset$ is possible, namely, in the case when
there are no systems which can be selected by a as well as by b, i.e. if the
selection procedures a and b are incompatible.

If $a \subset b$, the selection procedure a is called finer than b. If a is finer than b,
and if one selects by the method a all x of b, the remaining systems of b are
those of $b \setminus a$; AS 1.1 says that the selection of these remaining systems is also
a selection procedure.
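
For a finite set M the two axioms can be checked mechanically. The sketch below is an illustration added here under explicit assumptions: selection procedures are modelled as finite Python sets, the inclusion $a \subset b$ is read as allowing a = b (so that the empty set belongs to every non-empty selection structure), and the function name and the sample families are hypothetical.

```python
def is_selection_structure(family):
    """Check AS 1.1 and AS 1.2 for a finite family of finite sets."""
    fam = {frozenset(s) for s in family}
    for a in fam:
        for b in fam:
            # AS 1.2: the intersection of two selection procedures is one as well.
            if (a & b) not in fam:
                return False
            # AS 1.1: if a is finer than b, the remainder b \ a is one as well.
            if a <= b and (b - a) not in fam:
                return False
    return True

# Hypothetical example on M = {1, 2, 3, 4}.
M = {1, 2, 3, 4}
family_ok = [set(), {1}, {2}, {1, 2}]    # satisfies both axioms, M itself not included
family_bad = [{1, 2}, {2, 3}]            # the intersection {2} is missing

print(is_selection_structure(family_ok))
print(is_selection_structure(family_bad))
```

The first family is a selection structure although M itself is not one of its elements, in line with the remark that $M \in \mathcal{S}$ has not been postulated.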

The following two examples show the unrealistic feature of the postulate
$M \in \mathcal{S}$.

Consider an apparatus producing steel balls. This apparatus is an example of
a selection procedure for steel balls. Let us denote this selection procedure by
a. Then $a \subset M$ and $a \in \mathcal{S}$ hold, M being the 'set of all steel balls'. The set
$M \setminus a$ is then characterized as the set of all those elements which have not been
selected at all. The knowledge of the construction of the machine makes it
possible to specify various properties of the systems of a. However, there are
no properties which can be ascribed to the systems of $M \setminus a$. Therefore, we
have not included $M \setminus a$ in the set of selection procedures.

We have a similar situation in the case of an accelerator of electrons. The 
knowledge of the construction of the accelerator yields very essential informa- 
tion on the electrons produced by this apparatus. Such information is necessary 
for any physicist who wants to perform experiments on these electrons. 
However, what can be said about all those electrons which are not produced by 
this particular accelerator? 

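In the following, let $\mathcal{S}(a)$ be the abbreviation for the set:

$$\mathcal{S}(a) = \{\, b \mid b \in \mathcal{S} \text{ and } b \subset a \,\}$$

As a consequence of AS 1.1, 2, $\mathcal{S}(a)$ is a Boolean algebra of sets, $a$ being the
unit element of $\mathcal{S}(a)$.

To every set of subsets of $M$ (i.e. to every subset of $\mathcal{P}(M)$) there exists a
smallest set $\mathcal{S}$ of selection procedures containing it. $\mathcal{S}$ is called the set of
selection procedures generated by the given set.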


In this section we want to give a mathematical formulation of the second 
fundamental concept of quantum mechanics, i.e. the concept of statistics, of 
probability. Many authors consider the familiar mathematical probability 
theory as sufficient for the foundation of quantum mechanics. Other authors 
state that the quantum mechanical probability given by (1) cannot be formu- 
lated in the framework of the familiar mathematical probability theory. 
Indeed, this last opinion will prove wrong as we will see in the following. There 
are two reasons why we will recall now some essential aspects of the familiar 
mathematical probability theory: 

Reason 1: The axioms of mathematical statistics are so simple that we are 
able to list them in order to give a complete survey of all fundamental concepts 
of quantum mechanics. 

Reason 2: We will formulate the axioms in a more 'physical' form, i.e. in a 
form more suitable for describing experiments with physical systems. 

In the results of experiments, probability appears in the form of frequencies 
with which a selection procedure b finer than a selects systems of a. 

In other words: if, in an experiment, $N$ systems $x_1, \ldots, x_N \in M$ are selected by
the procedure $a$ and if one selects out of these $N$ systems those $N'$ systems
which fulfil also the conditions of $b$, the number $N'/N$ is called the frequency
with which $b$ has selected systems of $a$. In many (not in all!) cases the
experiments show a 'reproducibility' of this frequency, i.e. if one repeats
experiments employing selection procedures $a$ and $b$, one obtains nearly the
same frequencies if $N$ and $N'$ are 'large' numbers. If such a reproducible
frequency exists, we say that $b$ depends statistically on $a$. To give a mathematical
description of such a statistical dependence we introduce a mathematical
structure called 'statistical selection procedures' or, for short, a 'statistical
selection structure', which is defined as follows:

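A set $\mathcal{S} \subset \mathcal{P}(M)$ is called a structure of statistical selection procedures if AS 1
holds and if a mapping $\lambda$ is given, mapping

$$\mathcal{T} = \{(a, b) \mid a, b \in \mathcal{S};\ a \supset b \text{ and } a \neq \emptyset\}$$

into the interval $[0, 1]$ of real numbers and if the following axioms hold:

AS 2.1   $a_1, a_2 \in \mathcal{S}$, $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 \in \mathcal{S} \Rightarrow$
         $\lambda(a_1 \cup a_2, a_1) + \lambda(a_1 \cup a_2, a_2) = 1$

AS 2.2   $a_1, a_2, a_3 \in \mathcal{S}$, $a_1 \supset a_2 \supset a_3$, $a_2 \neq \emptyset \Rightarrow$
         $\lambda(a_1, a_3) = \lambda(a_1, a_2)\,\lambda(a_2, a_3)$

AS 2.3   $a_1, a_2 \in \mathcal{S}$, $a_1 \supset a_2$, $a_2 \neq \emptyset \Rightarrow \lambda(a_1, a_2) \neq 0$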

$\lambda(a, b)$ is usually called the probability of $b$ relative to $a$. $\lambda(a, b)$ is the
mathematical picture of the frequency with which $b$ selects relative to $a$ as
described above. With this 'interpretation' of $\lambda(a, b)$ at hand, the reader may
easily check the 'physical' significance of axioms AS 2.1 to 3 (see also Ludwig,
1975 and in preparation a).
From AS 2.1 to 3 we obtain:

$$\lambda(a, a) = 1, \qquad \lambda(a, \emptyset) = 0;$$

and for $a_1 \supset a_2$, $a_1 \supset a_3$ and $a_2 \cap a_3 = \emptyset$:

$$\lambda(a_1, a_2 \cup a_3) = \lambda(a_1, a_2) + \lambda(a_1, a_3)$$
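
As an illustration of axioms AS 2.1 to 3 and of the relations derived from them, one may take the simplest conceivable model: finite sets with $\lambda(a, b)$ given by the counting frequency $|b|/|a|$. This is an assumption made only for the sketch below (the text does not prescribe any particular $\lambda$), and the function name `lam` is hypothetical.

```python
from fractions import Fraction


def lam(a, b):
    """Toy lambda(a, b): relative frequency of b within a (finite sets, b ⊂ a, a ≠ ∅)."""
    if not a or not b <= a:
        raise ValueError("lambda(a, b) needs a non-empty a and b contained in a")
    return Fraction(len(b), len(a))


a1 = frozenset(range(6))                      # six selected systems
a2 = frozenset({0, 1, 2})                     # a finer selection procedure
a3 = frozenset({0})                           # finer still
d1, d2 = frozenset({0, 1}), frozenset({2})    # two disjoint parts of a2

as_2_1 = lam(d1 | d2, d1) + lam(d1 | d2, d2)        # should equal 1
as_2_2 = (lam(a1, a3), lam(a1, a2) * lam(a2, a3))   # both entries should agree
additivity = (lam(a1, d1 | d2), lam(a1, d1) + lam(a1, d2))
```

Exact fractions are used so that the product rule AS 2.2 and the additivity relation can be compared without rounding.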

By $\mu(b) = \lambda(a, b)$, an additive measure on the Boolean algebra $\mathcal{S}(a)$ is
defined.

In the sequel, the following definition will be important:

Definition: A decomposition $a = \bigcup_{i=1}^{n} b_i$ of an $a \in \mathcal{S}$ with $b_i \neq \emptyset$, $b_i \in \mathcal{S}$,
$b_i \cap b_k = \emptyset$ if $i \neq k$ is called a 'demixture' of $a$ into the $b_i$'s and $a$ is called a
'mixture' of the $b_i$'s. $\lambda(a, b_i)$ is called the 'weight' of $b_i$ in $a$.

From AS 2.1 through 3 we obtain 


As we have mentioned in the beginning, we want to give a mathematical picture
of the procedures by which physical systems are prepared. To this end, we
introduce a structure on $M$ ($M$ being the set of systems) by a set $\mathcal{Q} \subset \mathcal{P}(M)$ (the
elements of $\mathcal{Q}$ shall be the pictures of the various preparing procedures), for
which the following axiom holds:

APS 1   $\mathcal{Q}$ is a statistical selection structure.

The probability function defined by APS 1 will be denoted by $\lambda_{\mathcal{Q}}$. '$x \in a$' is the
mathematical form of the proposition: the physical system $x$ has been prepared
by the procedure $a$.


It is a bit more complicated to give a mathematical picture of the procedures by
which physical systems are recorded. The recording process is characterized by
two steps:

(1). Construction and employment of the recording apparatus.

(2). Selection according to signals which appeared (or did not appear) on the
recording apparatus employed.

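Accordingly, we define another mathematical structure on $M$ by choosing
two other subsets of $\mathcal{P}(M)$: $\mathcal{R}_0$ and $\mathcal{R}$. $\mathcal{R}_0$ and $\mathcal{R}$ satisfy the following axioms:

APS 2     $\mathcal{R}$ is a selection structure.

APS 3     $\mathcal{R}_0$ is a statistical selection structure.

APS 4.1   $\mathcal{R}_0 \subset \mathcal{R}$.

APS 4.2   $b \in \mathcal{R}$, $b \supset b_0 \in \mathcal{R}_0$ and $b_0 \neq \emptyset \Rightarrow b \in \mathcal{R}_0$.

APS 4.3   To each $b \in \mathcal{R}$ there is a $b_0 \in \mathcal{R}_0$ with $b_0 \supset b$.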



In order to describe the physical meaning of APS 2 through 4, we must say of
what the elements of $\mathcal{R}_0$ and $\mathcal{R}$ are pictures.

An element $b_0 \in \mathcal{R}_0$ represents the construction and employment of a recording
apparatus. We may clarify it by an example: The constructed apparatus may
be a Geiger counter; then $b_0$ is the set of all those microsystems to which this
Geiger counter is applied. $x \in b_0$ is the mathematical form of the proposition:
the Geiger counter $b_0$ has been used to record $x$. This does not imply that a
recording signal has been produced by $x$. Therefore, we call $\mathcal{R}_0$ the set of
recording methods.

The Geiger counter (mentioned above as an example), used to record $x$, can
respond or not; $b_+$ may be the selection procedure for all those systems $x \in b_0$ to
which the Geiger counter has responded; hence $b_+ \subset b_0$. Correspondingly $b_-$
may be the set of all those $x \in b_0$ to which the counter has not responded; hence
$b_- = b_0 \setminus b_+$. $b_+$ and $b_-$ are elements of $\mathcal{R}$. Generally $\mathcal{R}$ is the set of all those
selection procedures which are finer than the procedures of $\mathcal{R}_0$; finer by virtue
of the influence of the microsystems on the apparatus, represented by the
elements of $\mathcal{R}_0$. We express this briefly by saying: $\mathcal{R}$ is the set of all recording
procedures. Concerning the axioms APS 2 through 4, we will make only
short remarks; a more general discussion is given in Ludwig (1975 and in
preparation a).

APS 3 means that the statistical dependence between the various recording
methods has nothing to do with the microsystems. In contradistinction to the
elements of $\mathcal{R}_0$, the selection procedures $b \in \mathcal{R}$ depend essentially on the
influence of the microsystems. For this reason we did not state in APS 2 that $\mathcal{R}$
should be a statistical selection structure. We may illustrate this situation with
the example of a counter: In nature there are no reproducible frequencies
$\lambda(b_0, b_+)$ for the response of the counter as such; $\lambda(b_0, b_+)$ would describe
frequencies independent of the surroundings of the counter. In reality, the
frequency of the response of the counter depends essentially on its surroundings.

The probability function corresponding to $\mathcal{R}_0$ will be denoted by $\lambda_{\mathcal{R}_0}$.


The first physical problem is raised by the question: which preparing proce- 
dures and measuring procedures may be combined together. Unfortunately, 
this problem is not trivial. 

We define quite naturally: $a \in \mathcal{Q}$ and $b_0 \in \mathcal{R}_0$ are said to be combinable if
$a \cap b_0 \neq \emptyset$. The combination problem amounts to finding axioms (as laws of
nature in mathematical form) concerning the set

$$\mathcal{C} = \{(a, b_0) \mid a \in \mathcal{Q},\ b_0 \in \mathcal{R}_0,\ a \cap b_0 \neq \emptyset\}$$

A discussion of this combination problem would be beyond the scope of this 
paper (see Ludwig, in preparation a). Here it seems sufficient to give a very 
simple axiom — though this axiom, in fact, is not very realistic. (This simple 
axiom can be replaced by another more realistic one with essentially the same 
mathematical consequences.) 


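With the definitions

$$\mathcal{Q}' = \{a \mid a \in \mathcal{Q},\ a \neq \emptyset\}$$

$$\mathcal{R}_0' = \{b_0 \mid b_0 \in \mathcal{R}_0,\ b_0 \neq \emptyset\}$$

we formulate as an axiom:

APS 5   $\mathcal{C} = \mathcal{Q}' \times \mathcal{R}_0'$

The central problem of quantum mechanics is the description of the statistical
dependence of the recording on the preparing process. To begin with, we
define the set

$$\Theta = \{c \mid c = a \cap b \text{ and } a \in \mathcal{Q},\ b \in \mathcal{R}\}$$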

An element $c = a \cap b$ of $\Theta$ is the set of all systems $x$ prepared by the procedure
$a$ and recorded by the procedure $b$. Let $\mathcal{S}$ be the smallest set of selection
procedures for which $\Theta \subset \mathcal{S}$.

In general, neither $\mathcal{Q} \subset \mathcal{S}$ nor $\mathcal{R} \subset \mathcal{S}$ holds!

Now we formulate the experience that the combination of preparing and 
recording procedures leads to reproducible frequencies: 

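APS 6   $\mathcal{S}$ is a statistical selection structure.

The probability function corresponding to $\mathcal{S}$ will be denoted by $\lambda_{\mathcal{S}}$. The
three probability functions $\lambda_{\mathcal{Q}}$, $\lambda_{\mathcal{R}_0}$, $\lambda_{\mathcal{S}}$ cannot be independent for physical
reasons. Physical experience suggests that there is no dependence of the
recording methods $b_0 \in \mathcal{R}_0$ on the procedures $a \in \mathcal{Q}$. This fact is expressed by the
following axiom:

APS 7   $a_1, a_2 \in \mathcal{Q}$; $a_2 \subset a_1$; $b_{01}, b_{02} \in \mathcal{R}_0$;
        $b_{02} \subset b_{01}$ and $a_1 \cap b_{01} \neq \emptyset \Rightarrow$

        1   $\lambda_{\mathcal{S}}(a_1 \cap b_{01},\ a_2 \cap b_{01}) = \lambda_{\mathcal{Q}}(a_1, a_2)$;

        2   $\lambda_{\mathcal{S}}(a_1 \cap b_{01},\ a_1 \cap b_{02}) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})$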

Axiom APS 7 implies the important theorem (for proof see Ludwig, in
preparation a):
The function $\lambda_{\mathcal{S}}$ is determined uniquely by $\lambda_{\mathcal{Q}}$ and the special values

$$\lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b)$$

for $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0'$, $b \in \mathcal{R}$ and $b \subset b_0$.

If one looks at the various experiments, it is easy to see that only the values
$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ are tested by experimental physicists. Only one example
should be sketched: By a preparing procedure $a$ a pair of particles may be
produced to perform a collision experiment. The particles are recorded after
the collision by a recording method $b_0$. Let $b$ (with $b \subset b_0$) be the recording
procedure counting if $b_0$ has responded by a certain signal. $a \cap b_0$ are all those
systems (i.e. pairs of particles!) which are prepared by $a$ and for which the
recording method $b_0$ is employed; $a \cap b_0$ characterizes the collision experiment.
$a \cap b$ are all those systems to which the recording apparatus has
responded.

We define

$$\mathcal{F} = \{(b_0, b) \mid b_0 \in \mathcal{R}_0',\ b \in \mathcal{R},\ b \subset b_0\}$$

and call $\mathcal{F}$ the set of all effect procedures.
The following real function is defined on $\mathcal{Q}' \times \mathcal{F}$:

$$\mu(a, f) = \mu(a, (b_0, b)) = \lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b) \qquad (5)$$

The function $\mu(a, f)$ defined by (5) plays a central role in the statistical
description of physical systems, especially of microsystems. The axioms imply
the following theorem (for proof see Ludwig, in preparation a).

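The function $\mu(a, f)$ satisfies the following relations:

(1). $0 \le \mu(a, f) \le 1$.

(2). To every $a \in \mathcal{Q}'$, there is an $f_0 \in \mathcal{F}$ for which $\mu(a, f_0) = 0$.

(3). To every $a \in \mathcal{Q}'$, there is an $f_1 \in \mathcal{F}$ for which $\mu(a, f_1) = 1$.

(4). Every demixture $a = \bigcup_i a_i$ (see the end of Section 1(B)) implies

$$\mu\Big(\bigcup_i a_i,\ f\Big) = \sum_i \lambda_i\, \mu(a_i, f) \quad\text{and}\quad 0 < \lambda_i = \lambda_{\mathcal{Q}}(a, a_i) \le 1,\ \ \sum_i \lambda_i = 1$$

(5). $b_{01} \supset b_{02} \supset b$ ($b_{01}, b_{02} \in \mathcal{R}_0'$) and $f_1 = (b_{01}, b)$, $f_2 = (b_{02}, b)$ implies
$\mu(a, f_1) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\, \mu(a, f_2)$ for all $a \in \mathcal{Q}'$.

(6). Every demixture $b_0 = \bigcup_{i=1}^{n} b_i$ (i.e. $b_i \in \mathcal{R}$, $b_i \cap b_k = \emptyset$ if $i \neq k$) implies (with
$f_i = (b_0, b_i)$)

$$\sum_{i=1}^{n} \mu(a, f_i) = 1 \quad\text{for all } a \in \mathcal{Q}'$$

(7). $\mu(a, (b_0, b)) = 0$ is equivalent to $a \cap b = \emptyset$.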
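
Relations (1), (6) and (7) can be checked in a toy model. The sketch below assumes finite sets of integer-labelled systems and takes $\lambda_{\mathcal{S}}$ to be a counting frequency; both are assumptions made only for illustration, and the names `mu`, `b_plus` and `b_minus` are hypothetical (the last two mirror the counter responses $b_+$ and $b_-$ of the Geiger counter example).

```python
from fractions import Fraction


def mu(a, b0, b):
    """Toy mu(a, (b0, b)) = lambda_S(a ∩ b0, a ∩ b), modelled as a counting frequency."""
    if not b <= b0:
        raise ValueError("the recording procedure b must be finer than the method b0")
    combined = a & b0
    if not combined:
        raise ValueError("a and b0 are not combinable")
    return Fraction(len(a & b), len(combined))


a = frozenset(range(0, 8))            # systems prepared by procedure a
b0 = frozenset(range(4, 12))          # systems to which the counter was applied
b_plus = frozenset({4, 5, 6, 9})      # counter responded
b_minus = b0 - b_plus                 # counter did not respond

p_yes = mu(a, b0, b_plus)             # frequency of 'response' among a ∩ b0
p_no = mu(a, b0, b_minus)             # frequency of 'no response'
p_none = mu(a, b0, frozenset({9, 10}))  # b disjoint from a
```

Here $b_0 = b_+ \cup b_-$ is a demixture of the recording method, so the two frequencies add up to one, as relation (6) demands.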

According to the following theorem (proof by H. Neumann, to be published
in Ludwig, in preparation a), the statistical structure of the theory is completely
described by the function $\mu$. Given $\lambda_{\mathcal{Q}}$, the conditions (1), (4), (5) and (7) for the
function $\mu(a, f)$ imply the existence of a uniquely defined probability function
$\lambda_{\mathcal{S}}$ with

$$\lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b) = \mu(a, (b_0, b))$$

$\lambda_{\mathcal{R}_0}$ is determined by $\mu(a, f) = \mu(a, (b_0, b))$.

By the formulation of the axioms APS 1 through APS 7, we have reached our
first aim, namely, the definition of the concept of physical systems: those
components of experiments which are represented by elements of a set $M$
(according to the mapping principles of the physical theory) are called physical
systems if the set $M$ is endowed with a structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ such that the axioms
APS 1 through APS 7 hold, and if the elements of $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ are pictures of
preparing and recording procedures (according to the mapping principles of
the theory). All probabilities concerning the outcome of experiments on a
physical system are determined by the function $\mu$.

However, this structure is not yet typical for microsystems, as may easily be
seen by looking at the example of the gun (as an $a \in \mathcal{Q}'$) and the impacts of the
projectiles (as a $b \in \mathcal{R}$).


The next essential step in the development of the theory will be the introduction
of the notions of ensembles and effects. We will introduce these notions on
the basis of preparing and recording procedures; it is to be stressed that these
notions do not agree in every respect with the customary intuitive usage of the
words ensembles and effects.

In the discussion of problems in the interpretation of quantum mechanics,
many difficulties arise due to the fact that, in using 'common language' for a
description of experiments, usually no difference is made between preparing
procedures and ensembles (or 'states'): a family of microsystems, prepared by a
procedure $a \in \mathcal{Q}'$, is often called an ensemble or a set of systems in a 'state',
where the ensemble or state is described by a statistical operator $W$. It is
impossible to give here a survey of all misunderstandings and mistakes caused
by not distinguishing between preparing procedures and ensembles. Only in
the case of the so-called Einstein-Podolsky-Rosen paradox will we demonstrate
(see Section 8) how the situation is clarified by the use of the
concepts introduced here.

Also the notion of effects (also known as yes-no measurements or questions) 
is used in a different sense by various authors. It may be stressed here from the 
outset that in the Hilbert space representation of the theory the effects are not 
always represented by projection operators! 

Since the function $\mu$ is defined by (5) on the whole of the set $\mathcal{Q}' \times \mathcal{F}$, the
relation defined by

$$\mu(a_1, f) = \mu(a_2, f) \quad\text{for all } f \in \mathcal{F}$$

is an equivalence relation on $\mathcal{Q}'$: $a_1 \sim a_2$. The notion of ensembles is defined by:

Definition 1: Let $\mathcal{K}$ be the set of all equivalence classes in $\mathcal{Q}'$. An element of
$\mathcal{K}$ is called an ensemble (or a state); $\mathcal{K}$ is called the set of ensembles (or states).

The relation

$$\mu(a, f_1) = \mu(a, f_2) \quad\text{for all } a \in \mathcal{Q}'$$

defines an equivalence relation $f_1 \sim f_2$ on $\mathcal{F}$.

Definition 2: Let $\mathcal{L}$ be the set of all equivalence classes in $\mathcal{F}$. An element of
$\mathcal{L}$ is called an effect; $\mathcal{L}$ is called the set of effects.
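
The passage from procedures to ensembles and effects is a quotient construction, which can be sketched for a finite table of values $\mu(a, f)$. The table below is invented purely for illustration, and the helper names `ensembles` and `effects` are hypothetical: preparing procedures with identical rows fall into one ensemble, effect procedures with identical columns into one effect.

```python
from collections import defaultdict
from fractions import Fraction as Fr


def ensembles(mu_table):
    """Group preparing procedures a by their full row f -> mu(a, f)."""
    classes = defaultdict(list)
    for a, row in mu_table.items():
        classes[tuple(sorted(row.items()))].append(a)
    return sorted(sorted(c) for c in classes.values())


def effects(mu_table):
    """Group effect procedures f by their full column a -> mu(a, f)."""
    columns = defaultdict(dict)
    for a, row in mu_table.items():
        for f, value in row.items():
            columns[f][a] = value
    classes = defaultdict(list)
    for f, col in columns.items():
        classes[tuple(sorted(col.items()))].append(f)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical values of mu(a, f) for three preparing and three effect procedures.
mu_table = {
    "a1": {"f1": Fr(1, 2), "f2": Fr(1, 2), "f3": Fr(1)},
    "a2": {"f1": Fr(1, 2), "f2": Fr(1, 2), "f3": Fr(1)},
    "a3": {"f1": Fr(0), "f2": Fr(0), "f3": Fr(1)},
}
K = ensembles(mu_table)   # equivalence classes of preparing procedures
L = effects(mu_table)     # equivalence classes of effect procedures
```

In this example the technically different procedures `a1` and `a2` represent the same ensemble, and `f1` and `f2` the same effect.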

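The following theorem holds: For $w \in \mathcal{K}$, $g \in \mathcal{L}$ and $a \in w$, $f \in g$ the equation

$$\tilde{\mu}(w, g) = \mu(a, f)$$

defines a real function $\tilde{\mu}$ on $\mathcal{K} \times \mathcal{L}$. $\tilde{\mu}$ satisfies:

(1). $0 \le \tilde{\mu}(w, g) \le 1$.

(2). $\tilde{\mu}(w_1, g) = \tilde{\mu}(w_2, g)$ for all $g \in \mathcal{L} \Rightarrow w_1 = w_2$.

(3). $\tilde{\mu}(w, g_1) = \tilde{\mu}(w, g_2)$ for all $w \in \mathcal{K} \Rightarrow g_1 = g_2$.

(4). There is a $g_0 \in \mathcal{L}$ such that $\tilde{\mu}(w, g_0) = 0$ for all $w \in \mathcal{K}$.

(5). There is a $g_1 \in \mathcal{L}$ such that $\tilde{\mu}(w, g_1) = 1$ for all $w \in \mathcal{K}$.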

Definition 1 gives a precise definition of the notion of ensemble (or of state). 
Nevertheless, we shall give an explanation of this definition since the intuitive 
usage of the notion of ensemble does not always coincide with the notion 
defined in Definition 1. 

An ensemble $w$ is not a set or a family of microsystems, since $w$ is not a subset
of $M$. $w$ is a subset of $\mathcal{P}(M)$, an equivalence class the elements of which are
subsets of $M$. It is an important feature of quantum mechanics that a class $w$ has
more than one element. We will demonstrate this in the case of the example
given in Section 8.

It should be stressed that we do not use the notion of ensemble to formulate
the connection between experiment and mathematical theory, i.e. to interpret
quantum mechanics. The interpretation of quantum mechanics given here
depends only on the notions of preparing and recording procedures. Also, the
notion of ensemble is not necessary for the statistical description which is
already given by the function $\lambda_{\mathcal{S}}$. The notion of ensemble is used only to analyse
the structure which already has been founded in Section 1.

The following definition proves useful in the subsequent analysis of the 
structure of the theory. 

Definition 3: The canonical mapping which maps an element $a \in \mathcal{Q}'$ onto its
corresponding equivalence class, $w \in \mathcal{K}$, will be denoted by $\phi$; correspondingly,
let $\psi$ be the canonical mapping of $\mathcal{F}$ onto $\mathcal{L}$. For $f = (b_0, b)$, we also write
$\psi(b_0, b)$ instead of $\psi(f)$.

In the following, we simplify the notation in writing $\mu$ instead of $\tilde{\mu}$; the
arguments in the function will show whether $\mu$ is defined on $\mathcal{Q}' \times \mathcal{F}$ or $\mathcal{K} \times \mathcal{L}$. In
this sense, the equality

$$\mu(a, f) = \mu(\phi(a), \psi(f))$$

holds.


The relation (4) of Section 1(E) implies: If $a = \bigcup_i a_i$ is a demixture of a
preparing procedure $a$, we have for all $g \in \mathcal{L}$

$$\mu(\phi(a), g) = \sum_i \lambda_i\, \mu(\phi(a_i), g) \qquad (6)$$

with $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 < \lambda_i \le 1$ and $\sum_i \lambda_i = 1$.
We define, in analogy to equation (6):

Definition 4: Let $w \in \mathcal{K}$ be an ensemble, let $\lambda_i$ be a set of real numbers with
$0 \le \lambda_i \le 1$ and $\sum_i \lambda_i = 1$ and let $w_i \in \mathcal{K}$ be a set of ensembles such that for all
$g \in \mathcal{L}$

$$\mu(w, g) = \sum_i \lambda_i\, \mu(w_i, g) \qquad (7)$$

holds; then (7) is called a demixture of $w$ with respect to the components $w_i$ with
weights $\lambda_i$.

It is as essential as in the case of ensembles to avoid a false interpretation in
the case of effects. Similar to the case of the notion of ensembles it is to be
stressed that effects are classes of effect procedures $(b_0, b)$. The mapping $\psi$,
defined in Definition 3, maps several effect procedures onto the same effect. By
applying the mapping $\psi$, parts of the structure $(\mathcal{R}_0, \mathcal{R})$ may get lost in the image
$\mathcal{L}$ of $\psi$. This is actually the case as we will see in discussing coexistent effects in
Section 5(B). An effect procedure is characterized by a recording method $b_0$
and by a recording procedure $b$. Two recording methods $b_0^{(1)}$ and $b_0^{(2)}$ differing in
their technical specification together with different responses $b^{(1)}$ and $b^{(2)}$ can
be representatives of the same effect $g \in \mathcal{L}$, i.e.

$$\psi(b_0^{(1)}, b^{(1)}) = \psi(b_0^{(2)}, b^{(2)}) = g$$

The frequently used expression 'yes-no measurement' for the element
$(b_0, b)$ of $\mathcal{F}$ is due to the fact that $b$ represents the response of the apparatus $b_0$.
A more or less precise conception of yes-no measurement is used by many
authors. Unfortunately, various authors attach a different meaning to the
words 'yes-no measurement'. Some people do not use this word for the
experimental situation we are describing mathematically by the elements
$(b_0, b)$ of $\mathcal{F}$. Our mathematical description has the advantage that all possible
misunderstandings may be avoided. Also, the word question instead of yes-no
measurement is used by some authors, but not always in the same sense. Our
definition is: $(b_0, b) \in \mathcal{F}$ is a question, $b$ the answer 'yes' and $b_0 \setminus b$ the answer
'no'.

In contradistinction to our definition, some authors use the words yes-no
measurement or question to denote the elements of $\mathcal{L}$ which we called effects;
some authors use these words only for the elements of a subset of $\mathcal{L}$, which we
shall call 'decision effects' in the following (see Section 3).

In introducing the notions of ensemble (or state) and effect, we have taken a
first step towards the 'usual' representation of quantum mechanics. The next
step is the introduction of the concept of 'simultaneous measurement' and the
concept of 'observable'. Again, these concepts will be defined with the use of
the fundamental concepts introduced in Section 1. The interpretation of the
theory has already been given in Section 1 and new concepts are not introduced
for the interpretation, but rather for the sake of a structural analysis of the
theory.

However, before introducing new concepts, we will analyse in the next
chapter the connection between the mathematical symbols introduced so far,
and relation (1) of Section 1.


For a 'physical' approach to the structure represented by relation (1), it seems
best to introduce axioms (i.e. physical laws in mathematical form) for the
preparing and recording procedures and to deduce from these axioms the
following theorem: the function $\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$ can be represented in the
form $\mu(w, g) = \mathrm{tr}((\alpha w)(\beta g))$ with injective mappings $\alpha$, $\beta$. We will formulate
such axioms without a detailed physical discussion (cf., e.g., Ludwig, 1964,
1967a, b, c, d, 1968, 1970, 1971a, b); neither will we present the proofs of the
theorems (see Ludwig, 1970, 1972b; Stolz, 1971 and Ludwig, in preparation b).

Since one always performs only a finite number of experiments (see the
extensive discussions of the 'finiteness of physics' in Ludwig 1970, 1974b, and
in preparation b), we assume that the sets $M$, $\mathcal{Q}$, $\mathcal{R}$ are denumerable (this is
equivalent to the assumption that the completions of these sets are separable if
endowed with a physically meaningful uniform structure; see Ludwig 1970,
and in preparation b). If $M$, $\mathcal{Q}$, $\mathcal{R}$ are denumerable sets, so are $\mathcal{K}$ and $\mathcal{L}$.
The relations given in Section 2 imply the following theorem:
There is a pair of real Banach spaces $\mathcal{B}$, $\mathcal{B}'$ ($\mathcal{B}'$ denotes the dual Banach
space of $\mathcal{B}$) and an embedding of $\mathcal{K}$ into $\mathcal{B}$ and of $\mathcal{L}$ into $\mathcal{B}'$ such that

(1). The canonical bilinear form $\langle x, y \rangle$ defined on $\mathcal{B} \times \mathcal{B}'$ coincides with
$\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$, i.e.

$$\mu(w, g) = \langle w, g \rangle \big|_{\mathcal{K} \times \mathcal{L}}$$

(2). $\mathcal{B}$ is a base norm space (see, for instance, Nagel, 1974), the base $K$ being
equal to $\overline{\mathrm{co}}\,\mathcal{K}$ ($\overline{\mathrm{co}}\,\mathcal{K}$ denotes the norm-closed convex set generated by
$\mathcal{K}$); the cone generated by $K$ is norm-closed;

(3). The linear hull of $\mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$.

Points (1) through (3) determine $\mathcal{B}$ and $\mathcal{B}'$ uniquely up to isomorphism. $\mathcal{K}$
denumerable implies that $\mathcal{B}$ is separable.

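In the following, we will denote the bilinear form $\langle x, y \rangle$ by $\mu(x, y)$ because of
point (1) of the theorem.

$\mathcal{B}$ being a base norm space implies: $\mathcal{B}'$ is an order unit space. Because of
$0 \le \mu(w, g) \le 1$ for $w \in \mathcal{K}$ and $g \in \mathcal{L}$, it follows $0 \le \mu(w, g) \le 1$ also for $w \in K$
and $g \in \mathcal{L}$, i.e. $\mathcal{L} \subset [0, 1]$, $[0, 1]$ being the order interval between the zero
element and the order unit.

We denote by $L$ the set $\overline{\mathrm{co}}^{\sigma}\mathcal{L}$, the closure of $\mathrm{co}\,\mathcal{L}$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology.
Let $\mathcal{D}$ be the norm closure of the linear hull of $\mathcal{L}$. We have $1 \in \mathcal{D}$. $\mathcal{D}$ is a
separable Banach subspace of $\mathcal{B}'$ ($\mathcal{D}$ is also an order unit space). $\mathcal{D}$ is
$\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$. $K$ is $\sigma(\mathcal{B}, \mathcal{D})$-precompact and $\sigma(\mathcal{B}, \mathcal{D})$-separable.

$\mathcal{B}$ may be identified with a subspace of $\mathcal{D}'$ ($\mathcal{D}'$ being the dual Banach space of
$\mathcal{D}$; $\mathcal{D}'$ is a base norm space), and consequently $K$ may be identified with a subset
of $\mathcal{D}'$. Let $\bar{K}$ be the $\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$. $\bar{K}$ is $\sigma(\mathcal{D}', \mathcal{D})$-compact and $L$
is $\sigma(\mathcal{B}', \mathcal{B})$-compact. For the sets $\bar{K}$ and $L$ the theorem of Krein-Milman
holds.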

The topologies $\sigma(\mathcal{B}', \mathcal{B})$ and $\sigma(\mathcal{D}', \mathcal{D})$ are of considerable physical significance:
first, the topologies $\sigma(\mathcal{D}', \mathcal{D})$ and $\sigma(\mathcal{D}', \mathcal{L})$ are identical on $K$ and $\bar{K}$ and
the topologies $\sigma(\mathcal{B}', \mathcal{B})$, $\sigma(\mathcal{B}', K)$ and $\sigma(\mathcal{B}', \mathcal{K})$ are identical on $L$. The
topologies $\sigma(\mathcal{D}', \mathcal{L})$ on $K$ (and $\mathcal{K}$) and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ (and $\mathcal{L}$) are suited to
distinguish through experiment between different ensembles and different
effects, respectively. This should be clarified, at least to some extent, in the case
of ensembles: $\mu(w_1, g) = \mu(w_2, g)$ for all $g \in \mathcal{L}$ implies $w_1 = w_2$. However, in
experiments, an ensemble can only be tested by finitely many recording
procedures (i.e. by finitely many $g \in \mathcal{L}$ and not by all $g \in \mathcal{L}$). Likewise, the
probability $\mu(w, g)$ can be tested only with a finite error. Thus we see that the
inequalities

$$|\mu(w_1, g_i) - \mu(w_2, g_i)| < \varepsilon \qquad (i = 1, 2, \ldots, n)$$

may be tested experimentally only for a finite number of effects $g_1, \ldots, g_n$ and a
finite error $\varepsilon$. These inequalities determine (for various $\varepsilon$, $n$, $g_i$) a neighborhood
base of the topology $\sigma(\mathcal{D}', \mathcal{L})$.
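
The operational content of such a neighborhood, namely testing two ensembles against finitely many effects with a finite error, can be written down directly. The sketch below is a deliberately classical toy and not the general theory: it assumes ensembles are probability vectors, effects are vectors of response probabilities, and $\mu$ is their scalar product. The names `mu` and `indistinguishable` are hypothetical.

```python
from fractions import Fraction as Fr


def mu(w, g):
    """Toy bilinear form: w is a probability vector, g a vector of response probabilities."""
    return sum(wi * gi for wi, gi in zip(w, g))


def indistinguishable(w1, w2, effects, eps):
    """True if |mu(w1, g) - mu(w2, g)| < eps for each of the finitely many tested effects."""
    return all(abs(mu(w1, g) - mu(w2, g)) < eps for g in effects)


w1 = (Fr(1, 2), Fr(1, 2))
w2 = (Fr(51, 100), Fr(49, 100))
tested = [(1, 0), (0, 1)]          # the finitely many effects actually measured

coarse = indistinguishable(w1, w2, tested, Fr(1, 50))    # error bound 0.02
fine = indistinguishable(w1, w2, tested, Fr(1, 200))     # error bound 0.005
```

With the coarser error bound the two ensembles cannot be told apart; with the finer one they can.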
To formulate the laws for preparing and recording we give some definitions:

$$K_0(B) = \{w \mid w \in K,\ \mu(w, g) = 0 \text{ for all } g \in B \subset L\}$$

$$K_1(B) = \{w \mid w \in K,\ \mu(w, g) = 1 \text{ for all } g \in B \subset L\}$$

$$L_0(A) = \{g \mid g \in L,\ \mu(w, g) = 0 \text{ for all } w \in A \subset K\}$$

$K_0(B)$ and $K_1(B)$ are closed faces of $K$, $L_0(A)$ is a closed face of $L$. If $B$ has
only one element $g$, we write instead of $K_0(B)$ simply $K_0(g)$ and similarly for $K_1$
and $L_0$.

It may be easily seen that the ordering $y_1 \le y_2$ in $\mathcal{B}'$ is equivalent to the
relation

$$\mu(w, y_1) \le \mu(w, y_2) \quad\text{for all } w \in K$$

Let us formulate the first law (concerning recording procedures) by the
axiom:

AV 1.1   To every pair $g_1, g_2 \in L$ there exists a $g_3 \in L$ such that $g_3 \ge g_1$, $g_3 \ge g_2$,
         and $K_0(g_1) \cap K_0(g_2) \subset K_0(g_3)$.

AV 1.1 is equivalent to the following statement: Every set $L_0(A)$ has a greatest
element (see Ludwig, 1970, and in preparation b); this greatest element may be
denoted $eL_0(A)$.

All elements of $K$ (not only of $\mathcal{K}$) will be called ensembles (or states), all
elements of $L$ (not only of $\mathcal{L}$) will be called effects. The elements of the form
$eL_0(A)$ will be called decision effects. The set of all decision effects will be
denoted by $G$.

AV 1.1 implies $G \subset \partial_e L$ (for proof see Ludwig, 1970 and in preparation b),
$\partial_e L$ being the set of extremal points of $L$.

Since, for a family $\{A_\alpha\}$ of subsets of $K$, the relation $L_0(\bigcup_\alpha A_\alpha) = \bigcap_\alpha L_0(A_\alpha)$ holds,
the set $\{L_0(A) \mid A \subset K\}$ is a complete lattice, the order relation being given by
the set theoretical inclusion. The mapping $L_0(A) \to eL_0(A)$ is an order
isomorphism of $\{L_0(A) \mid A \subset K\}$ onto $G$; hence also $G$ is a complete lattice with
respect to the ordering induced on $G$ by the ordering of $\mathcal{B}'$.

Let $\hat{L}$ be the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $\{y \mid y = \lambda g,\ \lambda \in \mathbb{R},\ g \in L \text{ and } 0 \le \lambda g \le 1\}$. As
a second law (concerning again recording procedures), we postulate the axiom:

AV 1.2   $g \in \hat{L}$, $e \in G$ and $K_0(e) \subset K_0(g)$ implies $K_1(e) \supset K_1(g)$.

Let $C(w)$ be the norm-closed face of $K$ generated by $w$. Since $\mathcal{B}$ is separable,
every norm-closed face of $K$ is of the form $C(w)$ with a suitably chosen $w$.

The next law (concerning both preparing and recording procedures) is given
by the axiom:

AV 2   $w_1, w_2 \in K$ and $C(w_1) \not\subset C(w_2)$ imply that there is a $g \in L$ such that
       $w_2 \in K_0(g)$ but $w_1 \notin K_0(g)$.

AV 2 is equivalent to the relation: $K_0 L_0(F) = F$ for every norm-closed face $F$
of $K$.
The axioms AV 1.1, 2 and AV 2 imply

$$L = \hat{L} = [0, 1]$$

In the theory, we need the following axiom AV id which may be regarded as a
mere mathematical idealization:

AV id   $K_1(e) \neq \emptyset$ for all $e \in G$ with $e \neq 0$.

AV id cannot be tested by experiments! AV id is equivalent to: $(1 - e) \in G$ for
all $e \in G$.

The mapping $e \to 1 - e$ is an orthocomplementation in the lattice $G$. The
axioms AV 1.1 through AV id imply that $G$ is an orthocomplemented,
orthomodular lattice.

We define the following 'distance' between two closed faces of $K$:

$$\Delta(C(w_2), C(w_3)) = \tfrac{1}{2} \inf\{\mu(w, e_3) \mid w \in C(w_2)\} + \tfrac{1}{2} \inf\{\mu(w, e_2) \mid w \in C(w_3)\}$$

$e_i$ being an abbreviation for $eL_0 C(w_i)$. Two faces $C(w_2)$ and $C(w_3)$ are called
strictly separated if $\Delta(C(w_2), C(w_3)) \neq 0$.
The next law (concerning preparing procedures) is given by the axiom:

AV 3   If $w_1, w_2, w_3 \in K$ with $C(w_1) \subset C(w_3) \subset C(\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2)$ and if the faces
       $C(w_2)$, $C(w_3)$ are strictly separated, we have $C(w_1) = C(w_3)$.

All of the axioms AV 1.1, AV 1.2, AV 2, AV id and AV 3 hold for all (!)
known theories of physical systems, even for the so-called classical theories.
The axiom distinguishing between 'classical' systems and microsystems is the
following:

AV 4   For every face $C(w)$ of $K$ there is a sequence $w_\nu \in K$ such that $C(w_\nu)$ is
       of finite dimension, $C(w_{\nu+1}) \supset C(w_\nu)$ and

$$C(w) = \bigvee_\nu C(w_\nu)$$

It should be stressed that $\bigvee_\nu C(w_\nu)$ is not the set-theoretical union of the sets
$C(w_\nu)$, but the smallest closed face of $K$ which includes all $C(w_\nu)$.
The concept of microsystem may now be defined as follows:

Definition 6: If the axioms AV 1, AV 2, AV 3 and AV 4 hold, the set $M$
endowed with the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ is called a set of microsystems.

Since for the particular case of 'classical' systems the axiom AV 4 does not
hold we will call axiom AV 4 the 'law of microsystems'. 'Classical' physical
systems can be defined as such systems for which all (!) $C(w)$ are
infinite-dimensional and all decision effects are commensurable [for the concept of
commensurability see Section 5(B); other equivalent forms of axioms for
classical systems are given by H. Neumann (1972, 1974a, b); it has been
demonstrated in these references how to regain the $\Gamma$-space from preparing
and recording procedures].

It can be proved (see Ludwig 1970, 1972b; Stolz 1971, and Ludwig, in
preparation b), that the axioms AV 1, AV 2, AV id, AV 3 and AV 4 are
equivalent to the following statement:

$K$ and $L$ can be identified with the base of the Banach space $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$
and with the order interval $[0, 1]$ of the Banach space $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ respectively,
$\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ being the space of all sequences $(W_1, W_2, \ldots)$, where
every $W_i$ is a self-adjoint operator of the trace class in the Hilbert space $\mathcal{H}_i$ such
that $\sum_i \mathrm{tr}\big((W_i^2)^{1/2}\big) < \infty$. The dual Banach space $\mathcal{B}'(\mathcal{H}_1, \ldots)$ can be identified
with the space of all sequences $(A_1, A_2, \ldots)$, any $A_i$ being a self-adjoint and
bounded operator in $\mathcal{H}_i$ such that $\sup_i \|A_i\| < \infty$. The canonical bilinear form is
given by:

$$\langle (W_1, W_2, \ldots), (A_1, A_2, \ldots) \rangle = \sum_i \mathrm{tr}(W_i A_i)$$

$K$ is the set of all sequences $(W_1, W_2, \ldots)$ such that $0 \le W_i$ and $\sum_i \mathrm{tr}(W_i) = 1$. $L$
is the set of all sequences $(F_1, F_2, \ldots)$ such that $0 \le F_i \le 1$.
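
For a single Hilbert space the bilinear form reduces to $\mu(W, F) = \mathrm{tr}(WF)$, and the mixture property of $\mu$ can be checked numerically. The following minimal sketch assumes a two-dimensional real Hilbert space and hand-picked matrices; the names `matmul`, `trace` and `mu` are hypothetical helpers, not part of the formalism.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def mu(W, F):
    """mu(W, F) = tr(W F) for one Hilbert space, i.e. one entry of the sequences."""
    return trace(matmul(W, F))


W1 = [[1.0, 0.0], [0.0, 0.0]]
W2 = [[0.0, 0.0], [0.0, 1.0]]
W = [[0.75, 0.0], [0.0, 0.25]]     # the mixture 0.75*W1 + 0.25*W2, tr(W) = 1
F = [[0.5, 0.5], [0.5, 0.5]]       # an effect with 0 <= F <= 1
ONE = [[1.0, 0.0], [0.0, 1.0]]     # the order unit

p = mu(W, F)
p_mixed = 0.75 * mu(W1, F) + 0.25 * mu(W2, F)
```

The value obtained for the mixture agrees with the weighted sum over its components, which is the content of a demixture in the sense of Definition 4.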

The axioms imply that the $\mathcal{H}_i$ are Hilbert spaces over the fields R (of real
numbers) or C (of complex numbers) or Q (of quaternions). There are physical
arguments to eliminate the cases R and Q.


The identification (given in Section 3) of $K$ with the base of $\mathcal{B}(\mathcal{H}_1, \ldots)$ and of $L$
with the order interval $[0, 1]$ of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ makes it possible to interpret the
mappings $\phi$ and $\psi$ (defined in Section 2, Definition 3) as mappings of $\mathcal{Q}'$ into
$\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\mathcal{F}$ into $\mathcal{B}'(\mathcal{H}_1, \ldots)$, respectively. $\phi\mathcal{Q}'$ is then norm-dense in the
base $K$ of $\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\psi\mathcal{F}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L = [0, 1]$.

The norm-closed subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ generated by $\psi\mathcal{F}$ was denoted by $\mathcal{D}$
(see Section 3). A more detailed characterization of $\mathcal{D}$ by axioms cannot be
given here (see Ludwig, in preparation a). Nevertheless, it should be mentioned
that one may formulate axioms in such a way that $\mathcal{D}$ becomes a set of sequences
$(A_1, A_2, \ldots)$, $A_i$ being a self-adjoint operator of a certain C*-algebra $\mathcal{A}_i$ of
operators in $\mathcal{H}_i$. Thus $\mathcal{D}$ becomes the real part of a C*-algebra. We would like
to call the attention of the reader to the fact that the 'set of states' in the theory
of C*-algebras has been denoted (in Section 3) by $\bar{K}$ (and not by $K$, as is usually
done in the theory of C*-algebras).

The set $G$ of decision effects (as introduced in Section 3) may then be
identified with the set of all sequences

$$e = (E_1, E_2, \ldots)$$

$E_i$ being a projection operator in $\mathcal{H}_i$. The special decision effects

$$e_i = (0, 0, \ldots, 1_i, \ldots)$$

define the superselection rules. We shall refrain here from a detailed discussion.
However, it is not difficult to see that it is very practical to investigate
theoretically as well as experimentally the various 'sorts' of microsystems
(characterized by the $e_i$'s) separately. Each sort is described in one particular
Hilbert space. In this way, one obtains the 'usual' quantum mechanical
formalism. We cannot do this here in detail. Instead we want to show that the
sets $\mathcal{Q}$ of preparing procedures and $\mathcal{R}$ of recording procedures may serve to
elucidate some of the conceptions of 'usual' quantum mechanics.

As a first example we treat a structure very similar to the famous Heisenberg 
uncertainty relation. 

As mentioned above, it is sufficient to discuss the case of one Hilbert space
only. The following theorem holds (see Ludwig, 1970): There are two decision
effects $E_1$ and $E_2$ such that, for every ensemble $W$, at least one of the
following inequality relations must be false:

$\operatorname{tr}(W(E_1 - \alpha_1 1)^2) \le \varepsilon$

$\operatorname{tr}(W(E_2 - \alpha_2 1)^2) \le \varepsilon$

$\alpha_1 = \operatorname{tr}(WE_1)$, $\alpha_2 = \operatorname{tr}(WE_2)$.

This theorem is analogous to Heisenberg's uncertainty relation, formulated
for decision effects. It shows precisely that the Heisenberg uncertainty
relation has nothing to do with experimental errors (at least not in
principle; see also the discussion in Ludwig, 1975), since in measuring the
decision effects $E_1$ and $E_2$ only the two values one or zero may be obtained.
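The content of the theorem can be illustrated by a minimal numerical sketch. It assumes a two-dimensional Hilbert space and the illustrative choice of $E_1$ as the projector onto 'spin up along z' and $E_2$ as the projector onto 'spin up along x'; this example and all function names are assumptions, not taken from the text. For a projection $E$ with $\alpha = \operatorname{tr}(WE)$ one has $\operatorname{tr}(W(E - \alpha 1)^2) = \alpha(1 - \alpha)$, and the scan over ensembles suggests that the two dispersions cannot both fall below 1/8.

```python
def variance_of_projection(alpha):
    """tr(W (E - alpha*1)^2) for a projection E with alpha = tr(W E)."""
    return alpha * (1.0 - alpha)


def variances(rx, rz):
    """Dispersions of E1 and E2 in the ensemble W = (1 + rx*sx + rz*sz) / 2.

    Illustrative assumption: E1 projects onto spin up along z, E2 onto spin
    up along x, so tr(W E1) = (1 + rz)/2 and tr(W E2) = (1 + rx)/2.
    """
    alpha1 = (1.0 + rz) / 2.0
    alpha2 = (1.0 + rx) / 2.0
    return variance_of_projection(alpha1), variance_of_projection(alpha2)


def smallest_max_variance(steps=200):
    """Scan Bloch vectors (rx, 0, rz) with rx^2 + rz^2 <= 1.

    A non-zero y-component only shrinks the admissible (rx, rz) region,
    so it is left out of the scan.
    """
    best = 1.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            rx = -1.0 + 2.0 * i / steps
            rz = -1.0 + 2.0 * j / steps
            if rx * rx + rz * rz > 1.0:
                continue  # not an ensemble
            v1, v2 = variances(rx, rz)
            best = min(best, max(v1, v2))
    return best


if __name__ == "__main__":
    print("smallest value of max(variance_1, variance_2):", smallest_max_variance())
```

In this sketch the sum of the two dispersions equals $(2 - r_x^2 - r_z^2)/4 \ge 1/4$, which is why no preparation pushes both below 1/8, although each decision effect by itself only ever yields the values one or zero.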

We will simplify the following discussion by assuming $\varphi\mathcal{Q}' = K$
and $\psi\mathcal{F} = L$ (which is not essential). Then we can express the
physical content of the above theorem as follows:

There are two recording methods (i.e., it is possible to construct two
recording apparatus) $b_0^{(1)}$ and $b_0^{(2)}$ with responses $b^{(1)}$ and
$b^{(2)}$, respectively, such that $\psi(b_0^{(1)}, b^{(1)}) = E_1$ and
$\psi(b_0^{(2)}, b^{(2)}) = E_2$ are decision effects and such that, for every
preparing procedure $a$, at least one of the two probabilities

$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b^{(1)})$, $\lambda_{\mathcal{S}}(a \cap b_0^{(2)}, a \cap b^{(2)})$

is essentially different from zero or one. However strong the experimental
efforts made to construct preparing procedures, at least one of the
recording procedures $b^{(1)}$, $b^{(2)}$ will respond indeterministically, even
though $\psi(b_0^{(1)}, b^{(1)})$ and $\psi(b_0^{(2)}, b^{(2)})$ are decision effects.


Thus we may conclude that the Heisenberg uncertainty relation is a relation
which concerns merely the possibility of constructing preparing apparatus.
Still less does Heisenberg's uncertainty relation tell us anything about the
possibility of 'simultaneous measurement'. We will have to come back to this
problem in the next chapters. The vagueness in some discussions of
Heisenberg's uncertainty relation arises from the fact that in most cases there
is no clear-cut distinction between preparing and recording procedures, since
only the so-called 'ideal' measuring processes of the 'first' kind are discussed.
These particular measuring processes are recording and preparing procedures at
the same time; therefore, the Heisenberg uncertainty relation for the preparing
procedure forbids the simultaneous recording of the values of position and
momentum.


However, the concept of simultaneous measuring can be defined in a very
natural way without any recourse to the so-called ideal measuring processes of
the first kind. We will do this in Section 5(A).


The concept of an observable can be defined in a very natural way starting from
the experimental situation of recording, described by the terms $\mathcal{R}_0$
and $\mathcal{R}$.

(A) Coexistent recording procedures 

A pair $(b_0, b) \in \mathcal{F}$ was called an effect procedure. The mapping
$\psi$ connects with each effect procedure $(b_0, b)$ an effect
$\psi(b_0, b) \in L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$. All discussions
in Sections 3 and 4 were concerned with the mapping of one $f = (b_0, b)$ onto one
$g = \psi(f) = \psi(b_0, b)$ only. In reality, $b_0$ represents in general an
apparatus which has several possibilities of response, namely, all $b$ with
$b \subset b_0$. Let us denote by $\mathcal{R}(b_0)$ the set of all $b$ with
$b \subset b_0$. The elements of the set $\mathcal{R}(b_0)$ represent all
recording procedures that are possible in applying the recording method $b_0$. In
many experiments on microsystems, one uses methods $b_0$ where the set
$\mathcal{R}(b_0)$ is so large that it is practically impossible to determine all
$b \in \mathcal{R}(b_0)$.

This situation may be illustrated by the following two typical examples of 
recording apparatus. 


Each microsystem can produce responses of some of these counters. A
recording procedure $b$ can for instance be characterized by the answer of three
particular fixed counters. In this case the set $\mathcal{R}(b_0)$ is very well
known to technicians: $\mathcal{R}(b_0)$ is the Boolean algebra of switching
between the various counters.

2. The recording method $b_0$ is a bubble chamber (or a cloud chamber).

In this case the set of all $b$'s is immense: every possible bubble corresponds to
one $b$, but also every connected trace of bubbles corresponds to one $b$.

The two examples demonstrate another feature of quantum mechanics: The
various responses $b \in \mathcal{R}(b_0)$ of the apparatus $b_0$ do not
necessarily appear at the same time. In general a response $b$ need not be
instantaneous but may have a finite duration, for instance in the case where $b$
represents the simultaneous response of two coupled counters responding with a
time delay. Another example is a trace in a bubble chamber.

These examples show that the 'simultaneous recording' of the various $b$'s of
$\mathcal{R}(b_0)$ has nothing in common with 'measuring at the same time'. The
frequently used formulation, that some observables such as position and
momentum are 'not measurable at the same time' but at 'different times', is at
least incomprehensible, if not false (see Ludwig, 1975).

We define: 

Definition 7: The recording procedures $b \in \mathcal{R}(b_0)$ are called
coexistent with respect to the recording method $b_0$. Several
$(b_0, b) \in \mathcal{F}$ which have the same $b_0$ are called coexistent effect
procedures.

(B) Coexistent effects 

If $(b_0, b)$ is a family of coexistent effect procedures, then the set of the
effects $\psi(b_0, b)$ is a subset of $L$. Therefore, we may define a mapping
$\psi_{b_0}$ of $\mathcal{R}(b_0)$ into $L$ by

$\psi_{b_0}(b) = \psi(b_0, b)$.

It is not difficult to prove the following theorem: The mapping $\psi_{b_0}$ is an
additive and effective measure on the Boolean algebra $\mathcal{R}(b_0)$, which
maps the unit element of $\mathcal{R}(b_0)$ onto $1 \in L$.
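What such an additive and effective measure on a Boolean algebra of responses looks like can be sketched with a deliberately simplified toy model. Everything in it is an assumption made for illustration: a hypothetical three-counter apparatus with four mutually exclusive responses, and 'effects' represented by pairs of classical response probabilities (a commutative stand-in for elements of $L$).

```python
from fractions import Fraction
from itertools import combinations

# Atoms of the toy Boolean algebra: mutually exclusive responses of a
# hypothetical three-counter apparatus (exactly one of them occurs).
ATOMS = ("counter_1", "counter_2", "counter_3", "no_click")

# Toy effects: response probabilities in two hypothetical situations.
# In this commutative model an effect is a tuple of numbers in [0, 1].
ATOM_EFFECT = {
    "counter_1": (Fraction(1, 2), Fraction(0)),
    "counter_2": (Fraction(1, 4), Fraction(1, 3)),
    "counter_3": (Fraction(1, 8), Fraction(1, 3)),
    "no_click": (Fraction(1, 8), Fraction(1, 3)),
}


def measure(response_set):
    """Effect assigned to a response set (a frozenset of atoms)."""
    return tuple(
        sum((ATOM_EFFECT[atom][k] for atom in response_set), Fraction(0))
        for k in range(2)
    )


def all_elements():
    """All elements of the Boolean algebra, i.e. all subsets of ATOMS."""
    return [
        frozenset(subset)
        for size in range(len(ATOMS) + 1)
        for subset in combinations(ATOMS, size)
    ]


if __name__ == "__main__":
    unit = frozenset(ATOMS)
    print("measure of the unit element:", measure(unit))
    b1, b2 = frozenset({"counter_1"}), frozenset({"counter_2", "no_click"})
    print("additivity on disjoint sets:",
          measure(b1 | b2) == tuple(x + y for x, y in zip(measure(b1), measure(b2))))
```

The measure is additive on disjoint response sets, sends the unit element to the unit effect, and is effective in the sense that only the empty response set is mapped to zero.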
As an idealization we define: 

Definition 8: A set $A \subset L$ is called a set of coexistent effects if there
is a Boolean algebra $\Sigma$ endowed with an additive measure
$F: \Sigma \to L$ such that $A \subset F\Sigma$.

The essential conception of coexistent effects has been defined (though not 
yet in a clearcut way) in Ludwig (1964, 1967a, b, c, d). This conception is 
fundamental for the notion of observables. 

Definition 9: A set $A \subset G$ (i.e. a set of decision effects) is called a
set of commensurable decision effects if there is a Boolean algebra $\Sigma$
endowed with an additive measure $F: \Sigma \to G$ such that $A \subset F\Sigma$.


It may be proved that every set of coexistent decision effects is also a set of
commensurable decision effects and that the decision effects are commensurable
if and only if the projection operators (in $\mathcal{H}_i$) belonging to these
decision effects commute. This last condition is very well known. However, the
foundation of this condition is usually presented with much 'philosophy'.
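The commutation criterion is easy to test for concrete projection operators. The sketch below assumes a two-dimensional Hilbert space and three illustrative projectors chosen for this example only; the helper names are likewise hypothetical.

```python
def projector(v):
    """Orthogonal projector |v><v| / <v|v> as a nested list of complex numbers."""
    norm_sq = sum(abs(x) ** 2 for x in v)
    return [[complex(x) * complex(y).conjugate() / norm_sq for y in v] for x in v]


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commute(A, B, tol=1e-12):
    """True if AB and BA agree entrywise within the tolerance."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return all(abs(AB[i][j] - BA[i][j]) <= tol for i in range(n) for j in range(n))


# Illustrative decision effects on a two-dimensional Hilbert space.
E_up = projector([1, 0])
E_down = projector([0, 1])
E_plus = projector([1, 1])

if __name__ == "__main__":
    print("E_up, E_down commute:", commute(E_up, E_down))
    print("E_up, E_plus commute:", commute(E_up, E_plus))
```

Here `E_up` and `E_down` commute and so form a commensurable pair, while `E_up` and `E_plus` do not commute.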

(C) Observables 

The notion of observable is nothing but an idealization of the correspondence
$\mathcal{R}(b_0) \to L$. This idealization is obtained in a process of completion
(see Ludwig 1970 and in preparation a): If a Boolean algebra $\Sigma$ is endowed
with an additive and effective measure $F: \Sigma \to L$, then a metric may be
defined on $\Sigma$ by

$d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \wedge \sigma_2^*) + F(\sigma_2 \wedge \sigma_1^*))$

$\sigma^*$ being the complement of $\sigma$; $w_0$ is an effective ensemble, for
instance, $w_0 = \sum_\nu \lambda_\nu w_\nu$, $\lambda_\nu > 0$,
$\sum_\nu \lambda_\nu = 1$, where the set $\{w_\nu\}$ is dense in $K$. $\Sigma$ can
be completed with respect to this metric (see Ludwig 1970 and in preparation a).

Definition 10: A Boolean algebra $\Sigma$ endowed with an additive measure
$F: \Sigma \to L$ is called an observable if $\Sigma$ is complete and separable
(with respect to the metric defined above).

This general concept of an observable has been introduced and analysed in 
(Ludwig, 1970); a more detailed analysis is contained in (Neumann, H., 1971) 
and in (Ludwig, in preparation a). It is impossible to give a structural analysis of 
the concept of an observable in this short article. We only wanted to stress that 
the concept of an observable is no more than an idealization of recording 


To show at least the connection between the notion of an observable defined
in Definition 10 and the 'customary' notion, we add the following definitions:

Definition 11: An observable is called a decision observable if the mapping $F$
of Definition 10 is a mapping into $G$.

Definition 12: A mapping $R \xrightarrow{\sigma} \Sigma$ of the set $R$ of real
numbers into a Boolean algebra $\Sigma$ is called a measuring scale of $\Sigma$ if
$\alpha_1 > \alpha_2$ implies $\sigma(\alpha_1) \ge \sigma(\alpha_2)$ and if
$\sigma(-\infty) = 0$, $\sigma(+\infty) = e$ ($e$ being the unit element of
$\Sigma$) and if the set of all $\sigma(\alpha)$ generates the whole algebra
$\Sigma$.

A decision observable endowed with a measuring scale is identical with what 
is 'usually' called an observable. 
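The role of a measuring scale can be sketched on a finite toy algebra. The three 'measured values' and the representation of the Boolean algebra as the subsets of these values are assumptions made only for this illustration; the scale sends a real number $\alpha$ to the element 'measured value not greater than $\alpha$'.

```python
import math

# Hypothetical finite example: the atoms of the Boolean algebra are the
# possible measured values; elements of the algebra are sets of values.
VALUES = (-1.0, 0.5, 2.0)
ZERO = frozenset()
UNIT = frozenset(VALUES)


def sigma(alpha):
    """Measuring scale: the element 'measured value <= alpha'."""
    return frozenset(v for v in VALUES if v <= alpha)


def atoms_generated_by_scale():
    """Recover every atom as a difference of successive scale elements."""
    atoms, previous = [], ZERO
    for v in sorted(VALUES):
        current = sigma(v)
        atoms.append(current - previous)
        previous = current
    return atoms


if __name__ == "__main__":
    print("sigma(-inf) is the zero element:", sigma(-math.inf) == ZERO)
    print("sigma(+inf) is the unit element:", sigma(math.inf) == UNIT)
    print("atoms recovered from the scale:", atoms_generated_by_scale())
```

Because every atom is recovered from differences of scale elements, the scale generates the whole (finite) algebra, which is the last condition of Definition 12 in this simplified setting.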


So far, in the discussions of quantum mechanics only the simultaneous
measurability of decision observables has been of interest. However, the
question of the possibilities of simultaneous preparation was neglected or, at
the most, discussed as a partial aspect of simultaneous measurability, since one
had in mind only 'ideal' measuring processes 'of the first kind'. These 'ideal'
measuring processes are, indeed, connected with a certain idealized form of
'repreparing' processes (see Ludwig, 1972a, 1975). At any rate many
fundamental questions of the interpretation of quantum mechanics will become
much more transparent if the discussion of simultaneous preparation is
separated from that of simultaneous recording. Within the scheme of quantum
mechanics as outlined here, a quite natural question arises: what is the
condition for simultaneous preparation?

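Equation (6) combined with the identification of $K$ with a subset of
$\mathcal{B}(\mathcal{H}_1, \ldots)$ implies the theorem:

Let $a = \bigcup_{i=1}^{n} a_i$ be a demixture of the preparation procedure $a$,
then

$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i)$

holds with $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 < \lambda_i \le 1$ and
$\sum_i \lambda_i = 1$.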
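If there are two demixtures of the same preparing procedure

$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k$

then we have

$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i) = \sum_{k=1}^{m} \tilde{\lambda}_k \varphi(\tilde{a}_k)$

In this case, also, these two demixtures generate a third one, namely:

$a = \bigcup{}' (a_i \cap \tilde{a}_k)$,    (9)

where the union $\bigcup'$ is taken over those pairs $i, k$ for which
$a_i \cap \tilde{a}_k \ne \emptyset$. Expression (9) implies

$\varphi(a) = \sum{}' \lambda_{ik} \varphi(a_i \cap \tilde{a}_k)$

where $\lambda_{ik} = \lambda_{\mathcal{Q}}(a, a_i \cap \tilde{a}_k)$.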

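It is very useful to introduce the following mapping
$\varphi_a: \mathcal{Q}(a) \to \check{K}$ ($\check{K}$ being the cap of the cone
generated by $K$, i.e.

$\check{K} = \bigcup_{0 \le \lambda \le 1} \lambda K$)

defined by the equation:

$\varphi_a(\tilde{a}) = \lambda_{\mathcal{Q}}(a, \tilde{a}) \varphi(\tilde{a})$

It is not difficult to see that $\varphi_a: \mathcal{Q}(a) \to \check{K}$ is an
additive measure on the Boolean algebra $\mathcal{Q}(a)$ with
$\varphi_a(a) = \varphi(a) \in K$.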


Definition 13: An element $\tilde{w} \in \check{K}$ with
$\tilde{w} \le w \in K$ is called a mixture component, or shortly a component,
of $w$.

If $\tilde{w}$ is a component of $w$, then so is $w - \tilde{w}$; and
$w = \tilde{w} + (w - \tilde{w})$ is a demixture of $w$.

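Two demixtures of one and the same preparing procedure

$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k$

yield two demixtures of the ensemble $\varphi(a)$:

$\varphi(a) = \sum_{i=1}^{n} \varphi_a(a_i) = \sum_{k=1}^{m} \varphi_a(\tilde{a}_k)$

for which the components $\varphi_a(a_i)$ and $\varphi_a(\tilde{a}_k)$ are
elements of the image of $\mathcal{Q}(a)$ under $\varphi_a$. That leads to the
following definition: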

Definition 14: Two demixtures of an ensemble $w \in K$

$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k$    ($w_i, \tilde{w}_k \in \check{K}$)

are called coexistent if there is a Boolean algebra $\Sigma$ endowed with an
additive measure $W: \Sigma \to \check{K}$ such that $W(e) = w$ ($e$ being the unit
element of $\Sigma$) and $w_i, \tilde{w}_k \in W\Sigma$.

Two demixtures of one and the same preparing procedure $a$ give two
coexistent demixtures of the ensemble $\varphi(a)$.

Definition 15: A set $A \subset \check{K}$ is called a set of coexistent
components of $w$ if there is a Boolean algebra $\Sigma$ endowed with a measure
$W: \Sigma \to \check{K}$ such that $W(e) = w$ and $A \subset W\Sigma$.

The set

$\varphi_a\mathcal{Q}(a) = \{\varphi_a(\tilde{a}) \mid \tilde{a} \in \mathcal{Q}, \tilde{a} \subset a\}$

is a set of coexistent components of $\varphi(a)$.

If $W: \Sigma \to \check{K}$ is an additive and effective measure on the Boolean
algebra $\Sigma$,

defines a metric in $\Sigma$. $\Sigma$ may be completed in this metric and the
measure $W$ may be defined uniquely on this completion. This leads to the
following idealization of $\varphi_a: \mathcal{Q}(a) \to \check{K}$:

Definition 16: A Boolean algebra $\Sigma$ endowed with an additive measure
$W: \Sigma \to \check{K}$ such that $W(e) \in K$ and $\Sigma$ is complete and
separable is called a preparator.

Let $\Sigma$ be the completion of $\mathcal{Q}(a)$ in the metric defined by the
measure $\varphi_a$; then $\Sigma \xrightarrow{\varphi_a} \check{K}$ is a
preparator. Many experiments (especially in elementary particle physics) may be
described by a structure of the form
$\mathcal{Q}(a) \xrightarrow{\varphi_a} \check{K}$. This structure is also of
paramount significance for a discussion of fundamental problems of quantum
mechanics. We must refrain from a general mathematical structure-analysis of the
concept of preparators (see for instance Ludwig, 1975 and in preparation a);
rather, we will discuss some 'gedanken' experiments to elucidate the
significance of preparators and of coexistent and non-coexistent demixtures.

First, we shall show that there exist non-coexistent demixtures. It is sufficient
to prove this in the case of one Hilbert space $\mathcal{H}$ only.

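Let $\varphi, \psi \in \mathcal{H}$, $\|\varphi\| = \|\psi\| = 1$,
$\varphi \perp \psi$ and $0 < \lambda < 1$; then
$w = \lambda P_\varphi + (1 - \lambda) P_\psi$ is an ensemble.
$\lambda P_\varphi + (1 - \lambda) P_\psi$ is a demixture of $w$, but it is easy to
find others: Let $\chi$ be another vector in the plane generated by $\varphi$ and
$\psi$. Then it is possible to choose a real number $\mu$ ($0 < \mu < 1$) and a
vector $\eta \in \mathcal{H}$ such that

$w = \mu P_\chi + (1 - \mu) P_\eta$

is another demixture of the same $w$. Proof: $\chi$ has the form

$\chi = \varphi a + \psi b$; $|a|^2 + |b|^2 = 1$

The conclusion is obtained by putting:

$\mu = \dfrac{\lambda(1 - \lambda)}{\lambda|b|^2 + (1 - \lambda)|a|^2}$

$\eta = \dfrac{\lambda \bar{b}\,\varphi - (1 - \lambda)\bar{a}\,\psi}{\sqrt{\lambda^2|b|^2 + (1 - \lambda)^2|a|^2}}$

If $a \ne 0$ and $b \ne 0$, the two demixtures

$w = \lambda P_\varphi + (1 - \lambda) P_\psi = \mu P_\chi + (1 - \mu) P_\eta$

cannot be coexistent as we will prove immediately:

Let $\Sigma$ be a Boolean algebra endowed with an additive measure $W$ such that
there are $\sigma_1, \sigma_2 \in \Sigma$ satisfying
$W(\sigma_1) = \lambda P_\varphi$, $W(\sigma_2) = \mu P_\chi$. That implies
$W(\sigma_1^*) = W(e) - W(\sigma_1) = w - \lambda P_\varphi = (1 - \lambda) P_\psi$
and $W(\sigma_2^*) = (1 - \mu) P_\eta$. We want to calculate
$W(\sigma_1 \wedge \sigma_2)$. One has
$W(\sigma_1 \wedge \sigma_2) \le W(\sigma_1) = \lambda P_\varphi$ and
$W(\sigma_1 \wedge \sigma_2) \le W(\sigma_2) = \mu P_\chi$. Since
$v \in \check{K}$ and $0 \le v \le P_\varphi$, $0 \le v \le P_\chi$ implies $v = 0$,
we get $W(\sigma_1 \wedge \sigma_2) = 0$. In the same way, it follows that
$W(\sigma_1 \wedge \sigma_2^*) = 0$, $W(\sigma_1^* \wedge \sigma_2) = 0$, and
$W(\sigma_1^* \wedge \sigma_2^*) = 0$. Now
$e = (\sigma_1 \wedge \sigma_2) \vee (\sigma_1 \wedge \sigma_2^*) \vee (\sigma_1^* \wedge \sigma_2) \vee (\sigma_1^* \wedge \sigma_2^*)$
implies
$w = W(e) = W(\sigma_1 \wedge \sigma_2) + W(\sigma_1 \wedge \sigma_2^*) + W(\sigma_1^* \wedge \sigma_2) + W(\sigma_1^* \wedge \sigma_2^*) = 0$,
in contradiction to $w = \lambda P_\varphi + (1 - \lambda) P_\psi \ne 0$.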
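That one and the same ensemble admits two different demixtures into pure components can be checked numerically. The sketch below assumes a two-dimensional Hilbert space with $\varphi = (1, 0)$ and $\psi = (0, 1)$; the particular values of $\lambda$, $a$, $b$ and the expressions used for $\mu$ and $\eta$ are choices made for this illustration (one solution of the equality to be verified), and the function names are hypothetical.

```python
def projector(v):
    """Projector |v><v| / <v|v> for a two-component complex vector."""
    norm_sq = sum(abs(x) ** 2 for x in v)
    return [[v[i] * v[j].conjugate() / norm_sq for j in range(2)] for i in range(2)]


def mix(p, A, B):
    """The ensemble p*A + (1 - p)*B for 2x2 matrices A and B."""
    return [[p * A[i][j] + (1 - p) * B[i][j] for j in range(2)] for i in range(2)]


def second_demixture(lam, a, b):
    """Return (mu, chi, eta) for chi = a*phi + b*psi, phi = (1, 0), psi = (0, 1).

    Assumed choice: mu = lam(1-lam) / (lam|b|^2 + (1-lam)|a|^2) and
    eta proportional to lam*conj(b)*phi - (1-lam)*conj(a)*psi.
    """
    mu = lam * (1 - lam) / (lam * abs(b) ** 2 + (1 - lam) * abs(a) ** 2)
    chi = (a, b)
    eta = (lam * b.conjugate(), -(1 - lam) * a.conjugate())
    return mu, chi, eta


lam = 0.3
a, b = complex(0.6), complex(0.0, 0.8)  # |a|^2 + |b|^2 = 1
phi, psi = (1 + 0j, 0j), (0j, 1 + 0j)

mu, chi, eta = second_demixture(lam, a, b)
w_first = mix(lam, projector(phi), projector(psi))
w_second = mix(mu, projector(chi), projector(eta))
max_diff = max(abs(w_first[i][j] - w_second[i][j])
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    print("mu =", mu)
    print("largest entrywise difference between the two mixtures:", max_diff)
```

Both mixtures give the same density matrix up to rounding, although the pure components $P_\chi$, $P_\eta$ differ from $P_\varphi$, $P_\psi$; the argument above then shows that the two demixtures cannot be embedded in one Boolean algebra.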



So far, one is accustomed to speak about individual microsystems and about the 
results of experiments with individual microsystems using common (not 
mathematical) language. On the contrary, mathematical language is employed 
in classical mechanics of mass points, where every individual mass point is 
represented by a mathematical trajectory which is then compared with the 
'real' trajectory obtained in measurement. Every proposition concerning an 
individual system may be formulated in mathematical form. The situation is 
quite different in customary quantum mechanics, where an individual experi- 
mental result, e.g. an individual trace in a cloud chamber, cannot be compared 
with the theory, since the usual quantum mechanical mathematical picture 
comprises only terms describing the statistics. 

For instance, it is not possible to translate into mathematical language such
propositions as 'the position of this individual electron has been measured in
the region $V$'. It is not possible to find a corresponding mathematical relation
for this proposition. Only after a series of such measurements of the position
has been carried out can the statistics of the results be compared with the
theory using the relation $\mu = \operatorname{tr}(WE_V)$. Here, $E_V$ is the
following decision effect: the measured position is localized in $V$.

One has tried out many ways of coping with this defect of quantum 
mechanics. Some approaches should be mentioned. 

Some physicists call the projection operators (or, in our terminology, the 
decision effects) properties of the individual microsystems, and formulate 
propositions such as: The microsystem x has the property e ; I know that x has 
the property e; x has the property 'not e'; x has not the property e; I do not 
know whether x has the property e ; x has the property e with the probability a ; 
the property e has been measured on x; etc. No wonder that one gets into 
difficulties in using this type of language. Some people have tried to avoid these 
difficulties in introducing a more precise form of this language and a new logic, 


Another attempt is interpreting the projection operators as propositions. In 
this case, the authors have to say which propositions (formulated in common 
language) should be represented symbolically by projection operators. Corre- 
spondingly some authors interpret the lattice of projection operators as a 
proposition-logic, the so-called quantum logic. This conception of quantum 
logic is to be distinguished from others, where the lattice of decision effects is 
formally called 'quantum logic', however, without claiming that the word 
'quantum logic' should have anything in common with a logic of propositions. 

We want to show that every experimental result (also if concerning an 
individual microsystem) can be described by a mathematical relation if one 
accepts the foundations of quantum mechanics outlined in the first sections of 
this paper. This description is possible without employing any new form of logic 
and using only the usual logic of mathematics. 


It is not possible to give here a complete survey of all possible ways of
formulating propositions on individual microsystems in mathematical
language; only some examples may briefly be sketched. To avoid mathematical
difficulties connected with the fact that $\varphi\mathcal{Q}'$ is only dense in
$K$ and $\psi\mathcal{F}$ only dense in $L$, we will sharpen our axioms by
postulating: $\varphi\mathcal{Q}' = K$ and $\psi\mathcal{F} = L$. This sharpening
is without any fundamental physical relevance, since every comparison between
theory and experiment can only be made with a finite inaccuracy (in any case,
this sharpening may be avoided by using more sophisticated mathematics).

For the following discussions it is essential to realize that the physical
interpretation of '$x \in a$, $a \in \mathcal{Q}'$' and of '$x \in b$,
$b \in \mathcal{R}$' has already been given in Section 1. All other propositions
may be deduced from these two fundamental relations. For instance, a proposition
of the form 'the microsystem $x$ has the property $e$' makes no sense until one
has given this proposition a mathematical form which is reducible to relations of
the form $x \in a$, $a \in \mathcal{Q}'$ and $x \in b$, $b \in \mathcal{R}$. No
intuitive meaning of properties is introduced. Every interpretation must
ultimately be reducible to the interpretation of preparing and recording
procedures.

We define the following sets:

Definition 17: For every $e \in G$ let

$\mathcal{Q}(e) = \{a \mid a \in \mathcal{Q}'$ and $\varphi(a) \in K_1(e)\}$

$M_p(e) = \bigcup_{a \in \mathcal{Q}(e)} a$

We want to discuss the following relation:

$x \in M_p(e)$    (11)
The physical interpretation of (11) is clear by Definition 17, since $e$ is
defined by preparing and recording procedures as has been done in Section 2, and
$M_p(e)$ is the union of some preparing procedures. However, one usually wants
to express relation (11) by a short formulation in common language. To this
end, we introduce the following terminology: The elements $e \in G$ will be called
pseudo-properties (not simply properties, to avoid misunderstandings). We
express the relation (11) in the form: 'the microsystem $x$ has been prepared
with the pseudo-property $e$'.

Relation (11), formulated for microsystems, may also be formulated for
macrosystems and, in this context, it represents a very interesting technical
procedure. For instance, for steel balls manufactured by a machine, it reads:
'this individual steel ball $x$ has the property $e$, i.e., that the radius lies
in the interval $[r_1, r_2]$'.

Relation (11) is a proposition on a pseudo-property of the prepared
microsystem $x$. No misunderstanding is possible if we use relation (11) for such
propositions. The logical negation of (11) is $x \notin M_p(e)$. There is another
relation: $x \in M_p(e^\perp)$, $e^\perp = 1 - e$. However, the two relations
$x \notin M_p(e)$ and $x \in M_p(e^\perp)$ are not equivalent! This fact has
nothing to do with logic, but only reflects the structure of the family of sets
$\{M_p(e), e \in G\}$.

Another possible proposition on an individual microsystem may be obtained
using the following definition:

Definition 18: For every $e \in G$ we define:

$\mathcal{R}(e) = \{b \mid b \in \mathcal{R}$ and there is a $b_0 \in \mathcal{R}_0$ such that $b_0 \supset b$ and $\psi(b_0, b) \le e\}$;

$M_r(e) = \bigcup_{b \in \mathcal{R}(e)} b$

We want to discuss the relation

$x \in M_r(e)$    (12)

We express (12) in common language by the proposition: 'The pseudo-property
$e$ of the microsystem $x$ has been recorded'.
We define:

Definition 19: $M(e) = M_p(e) \cup M_r(e)$
The mathematical relation

$x \in M(e)$    (13)

is equivalent to

$x \in M_r(e)$ or $x \in M_p(e)$

We express (13) in normal language by: 'The microsystem $x$ has the
pseudo-property $e$'. It seems again in order to stress that the two relations
$x \notin M(e)$ and $x \in M(e^\perp)$ are not equivalent.

At the beginning of Section 7, we mentioned some of the propositions 
usually used in the interpretation of quantum mechanics. These propositions 
have the disadvantage that it is often unclear what they mean, and various 
authors may give them various meanings. This is not possible while using 
relations (11), (12) and (13). 

What about a new logic to treat the relations (11), (12) and (13)?

At first, it is to be stressed that in 'handling' mathematically relations (11), 
(12) and (13), one has to use customary mathematical logic! What does this 
imply? We will demonstrate this only by examples. 

Let, in the following, $x$ be an individual microsystem we are experimenting
with. Let $e$ be a well-defined element of $G$, for instance, the following
decision effect: 'the position is in a certain space-region $V$'. We will now
discuss some possibilities of describing what we have done in experimenting with
the microsystem $x$.

(1). We have shown in Ludwig (1970, 1975) that experimental results and
procedures may be written down in mathematical form. These relations
representing the experiments are, from the point of view of mathematical
theory, new axioms denoted in Ludwig (1970, 1975) by $(-)_r$. We assume that
these axioms $(-)_r$ are added to the mathematical theory and that, for instance,
(13) is a theorem in this stronger mathematical theory. If this is the case, one
says for instance: I know that $x$ has the (pseudo-)property $e$. The words 'I
know' could be misunderstood in the following way: someone could mean that my
subjective knowledge is essential here. However, what we, indeed, mean is that
the real experimental situation (expressed by the axioms $(-)_r$ together with
the theory) makes it possible to deduce (13) as a theorem. In the case where (13)
is deducible as a theorem, also another sentence is used: The proposition (13) is
'true'. If one uses these words 'I know' or 'true' only as abbreviations for the
fact that (13) is deducible as a theorem, there is no objection and no real
mistakes are possible.

(2). For instance, relation (12) is not a theorem; however, it may be added
without contradiction as an axiom to the theory complemented by the axioms
$(-)_r$. In this case, one may say: It is possible (but not necessary) to record
the pseudo-property $e$ on the microsystem $x$. Or: The pseudo-property $e$ can be
recorded on $x$. It is exactly this situation which, in the opinion of some
authors, may be described only by a new logic. In fact, any mathematician knows
that a relation like (12) may be added to a theory without any contradiction,
although (12) is not a theorem of the theory. In this case, it is also possible
to add (without any contradiction) the negation of the relation (12):
$x \notin M_r(e)$. If one wants to interpret the word 'true' by 'deducible as a
theorem' (see above), then, in this sense, mathematical logic has been
'many-valued' since the very beginning of mathematics. In a mathematical theory a
relation A can be a theorem, or the relation (not A) can be a theorem, or (as a
third possibility) neither A nor (not A) is a theorem, but A as well as (not A)
can be added as an axiom without any contradiction (naturally, not both of these
relations!). However, no new logical axioms different from the usual mathematical
axioms are necessary to discuss quantum mechanics.

(3). The negation of relation (12) is a theorem in the theory complemented
by the experimental results $(-)_r$. One says: It is impossible that $x$ could be
recorded with the pseudo-property $e$. It may be stressed again that
$x \notin M_r(e)$ does not imply $x \in M_r(e^\perp)$!

After having interpreted words such as 'I know', 'true', 'possible',
'impossible', we want to discuss example (2) in more detail. We assume, in
particular, that the microsystem $x$ has been prepared experimentally by a
procedure $a$ (but not yet recorded) and that the well-known construction of the
preparing apparatus makes it possible to determine the element $\varphi(a)$ of
$K$. Under this assumption, relation (12) may be added as an axiom without any
contradiction if and only if $\mu(\varphi(a), e) \ne 0$.

If this is the case, one sometimes states that relation (12) is true with
probability $\mu(\varphi(a), e)$. However, this statement could be misunderstood.
In fact, the question as to the 'possibilities' of (12) is very complicated.
Therefore, we ask: How is it possible to 'realize' the relation (12)? We ask for
an experimental possibility of a recording procedure, such that after writing
down the experimental results of recording (in the form of axioms $(-)_r$),
relation (12) will appear as a theorem. However, this experimental possibility
does not only depend on the probability, it also depends on a certain 'arbitrary
choice'.

What is arbitrary here is the selection of the recording method $b_0$.
$x \in b_0$ can be realized for many different recording methods $b_0$. It is up
to the experimentalist which method $b_0$ he will apply. If a certain apparatus
$b_0$ has been installed to be used in the experiment, the relation $x \in b_0$
(as one of the axioms of $(-)_r$) is to be added to the theory. Now we have a
situation very typical for microsystems: By the selection of a certain $b_0$,
i.e. by writing down $x \in b_0$ for a certain $b_0$, the 'possibilities' have
changed essentially. By $b_0$, i.e. by $\psi_{b_0}: \mathcal{R}(b_0) \to L$, an
observable is defined. It may be that the decision effect $e$ in (12) is not
coexistent with all effects $\psi(b_0, b)$. In particular, $b_0$ can be chosen in
such a manner that the following theorem holds:
There is no $b \in \mathcal{R}$ such that $b \ne \emptyset$, $b \subset b_0$, and
$\psi(b_0, b) \le e$.
If $x \in a$ and $x \in b_0$ is fixed by the experiment and if the theorem above
holds for $b_0$, then relation (12) is false, i.e. $x \notin M_r(e)$ is a theorem
in the theory with axioms $x \in a$, $x \in b_0$. We say: it is impossible that
$x$ could be recorded with the pseudo-property $e$.

However, if one has not yet fixed the recording method $b_0$, one is free to
choose another one. For instance, it is possible to choose a recording method
$b_0$ such that there exists a $b \in \mathcal{R}$ such that $b \subset b_0$ and
$\psi(b_0, b) \le e$. Now suppose we have chosen such a $b_0$ experimentally. What
can we say then about the 'possibility' of relation (12)?

It is clear that (12) may be added as an axiom without contradicting the
theory containing the axioms $x \in a$ and $x \in b_0$, if there is a
$b \in \mathcal{R}(b_0)$ such that $\psi(b_0, b) \le e$ and
$\mu(\varphi(a), \psi(b_0, b)) \ne 0$. If there are
$b_1, b_2 \in \mathcal{R}(b_0)$ such that $\psi(b_0, b_1) \le e$,
$\psi(b_0, b_2) \le e$, then also the relation $\psi(b_0, b_1 \cup b_2) \le e$
holds, which can be simply proved: $b_1 \cup b_2 = [b_1 \cap (b_0 \setminus b_2)] \cup b_2$
implies

$\psi(b_0, b_1 \cup b_2) = \psi(b_0, b_1 \cap (b_0 \setminus b_2)) + \psi(b_0, b_2)$

One has:

$g_1 = \psi(b_0, b_1 \cap (b_0 \setminus b_2)) \le \psi(b_0, b_1) \le e$

and $g_2 = \psi(b_0, b_2) \le e$. If we write $g = \psi(b_0, b_1 \cup b_2)$, we get
$g = g_1 + g_2$. This equation and the relations $g_1 \le e$, $g_2 \le e$ imply
$K_0(g) \supset K_0(e)$, which is equivalent to $g \le e$.

To avoid measure-theoretical arguments, we will assume that the union of all
elements $b'$ of $\mathcal{R}(b_0)$ such that $\psi(b_0, b') \le e$ is an element
$b$ of $\mathcal{R}(b_0)$. $b$ is the greatest element of $\mathcal{R}(b_0)$
satisfying $\psi(b_0, b) \le e$. Therefore, if $b_0$ has been chosen as the
recording method, we say: relation (12) is possible with probability
$\mu(\varphi(a), \psi(b_0, b))$. If $b_0$ has been chosen in such a way that
$\psi(b_0, b) = e$, the probability is equal to the greatest possible value
$\mu(\varphi(a), e)$.

We now assume that the recording of the system $x$ has been accomplished
and that our $b$ has given a response. Then the relation $x \in b$ must be added
to the theory as an axiom, i.e. as a mathematical formulation of the experimental
result. In this more comprehensive theory, relation (12) is a theorem; we say:
'the microsystem $x$ has been recorded with the pseudo-property $e$'. Since (13)
is a theorem, too, we also say after this experiment has been performed: 'The
microsystem $x$ has the pseudo-property $e$'.

However, if $b$ has not responded, i.e. if $b_0 \setminus b$ has given a response,
then $x \in b_0 \setminus b$ is to be added as an axiom. This implies that the
relation $x \notin M_r(e)$ is a theorem; we say: 'it is impossible that the system
$x$ has been recorded with the pseudo-property $e$'. But $x \in M(e)$ can be a
theorem if $\varphi(a) \in K_1(e)$!

Some authors are using intuitively the proposition 'x has the property e' as 
equivalent to relation (13). To others, this proposition seems inadmissible. We 
will see why. 

Let $e_1$, $e_2$ be two incommensurable decision effects such that
$M(e_1) \cap M(e_2) \ne \emptyset$ (for instance, let $e_1$ be the decision effect:
the position is in a very small space-region $V$, and $e_2$: the momentum is in a
very small region $\Pi$ of the momentum space).

$M(e_1) \cap M(e_2) \ne \emptyset$ implies that the relation

$x \in M(e_1)$ and $x \in M(e_2)$    (14)

could be an allowed hypothesis (see Ludwig 1974b, and in preparation b), i.e.
that in the case of a suitable experimental situation relation (14) may be added
to the theory without any contradiction. It may also be that the experimental
situation is such as to admit relation (14) as a theorem.

At first sight, relation (14) seems to contradict the well-known
quantum-mechanical laws, for instance, the Heisenberg uncertainty relation, if
the regions denoted above by $V$ and $\Pi$ are small enough. Relation (14)
expressed in common language would read: The microsystem $x$ has the
pseudo-property $e_1$ (for instance, a position in $V$) and the pseudo-property
$e_2$ (for instance, a momentum in $\Pi$). Such a sentence in common language
could, indeed, seem to be in contradiction to quantum mechanical laws. However,
in reality, there is no contradiction. We may clear up this problem by
formulating it in mathematical symbols. Relation (14) is, for instance, a theorem
if the experiment is of such a type that $x \in a$, $\varphi(a) \in K_1(e_1)$,
$x \in b$, $b$ being such that there is a $b_0 \supset b$ and
$\psi(b_0, b) \le e_2$.

Another objection to accepting the proposition '$x$ has the pseudo-property
$e$' as a translation of (13) into common language is the following: Let
$a \in \mathcal{Q}$ be such that $\varphi(a) \in K_1(e)$ and $x \in a$. Then (13)
is a theorem, even if the recording procedure $b$ is such that $\psi(b_0, b)$ is
not coexistent with $e$. After (!) the recording method $b_0$ had been employed,
the microsystem $x$ was influenced by the apparatus $b_0$ in such a way that the
microsystem does not have the pseudo-property $e$ any more. This objection is not
correct. It is true that there was an influence of the recording apparatus on the
microsystem. However, it is not essential for the interpretation of quantum
mechanics to know what happens after the recording procedure. Only the
correlation between preparing and recording is essential. Why do many physicists
believe that a measurement (we called it a recording procedure) yields
information on the system after (!) measurement? Only because they tacitly assume
the so-called 'ideal measurement of the first kind'! However, in general, the
recording is much more complicated, considering the influence of the recording
apparatus on the microsystem; see Ludwig (1972a, 1975). In our interpretation of
quantum mechanics, the recording response yields information on a microsystem
such as it was before the recording process materialized. The essential relations
$x \in a$ ($a \in \mathcal{Q}'$) and $x \in b$ ($b \in \mathcal{R}$) are relations
which give information on the system $x$ after preparing and before recording.
The probability $\mu(\varphi(a), \psi(b_0, b))$ gives the correlation between
preparing and recording processes, a correlation due to the so-called microsystem
conceived as a system between preparing and recording.
Instead of our description of the basic quantum-mechanical processes, given
by the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, a lot of
different postulates for the measuring process are considered to be fundamental
by other authors. The majority of these postulates have their origin in the
famous postulate (M) in J. v. Neumann's book Mathematical Foundations of
Quantum Mechanics (1955). This postulate (M) reads:

(M) If the physical quantity, R, is measured twice in succession in a system, S, then we get the
same value each time. This is the case even though R has a dispersion in the original state of S, and
the R-measurement can change the state of S.

V. Neumann's postulate (M) was the source of the widespread opinion that it is
essential, in a measurement, to fix the state after the measurement. The
existence of such measurements of the first kind, as they are postulated by (M),
is not necessary for the interpretation of quantum mechanics; the structure
$\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ is sufficient. Moreover, this
structure seems to represent a more natural basis for quantum mechanics
considered as a theory describing microsystems between preparing and recording.

The description of a microsystem as a real entity between preparing and
recording should not be taken as a hint to endow such microsystems with
additional intuitively motivated structures. Such additional (read: not
deducible from $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$) structures could
eventually prove to contradict quantum mechanics [see, for instance, the
theories of hidden variables, see also Ludwig

The examples (11) through (14) should suffice to demonstrate that, on the 
conceptual basis given in Section 2, it is possible to speak of individual 
microsystems without any contradiction. 


We will discuss only the spin example of the Einstein-Podolsky-Rosen paradox
(in the following abbreviated as the EPR paradox).

Pairs of particles may be prepared in such a way that the total spin is equal to 
zero; the spin of each single particle of the pair may be equal to 1/2. 

To describe the EPR paradox, one needs three Hilbert spaces $\mathcal{H}_i$: let
$\mathcal{H}_1$ be the Hilbert space of the particles of sort 1, $\mathcal{H}_2$
the Hilbert space of the particles of sort 2 and $\mathcal{H}_3$ the Hilbert
space of the pairs (1, 2). To simplify the problem we shall consider only the
spin spaces, i.e. $\mathcal{H}_1$ and $\mathcal{H}_2$ are both two-dimensional,
and $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$.

We install a preparing apparatus $a$ which prepares only pairs of total spin
zero. The ensemble $\varphi(a)$ is described by a statistical operator in
$\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$.
We have $\varphi(a) = P_\chi$, $\chi$ being the vector

$\chi = \frac{1}{\sqrt{2}} \left( u_+(1)\, u_-(2) - u_-(1)\, u_+(2) \right)$

We have denoted the eigenvectors of the 3-component of the spin of particle 1
by $u_+(1)$, $u_-(1)$, and the corresponding eigenvectors of particle 2 by
$u_+(2)$, $u_-(2)$.

Let $b_0^{(2)}$ be a recording method that measures the 3-component of the spin
of particles of sort 2. Then there is a recording procedure
$b^{(2)} \subset b_0^{(2)}$ such that $\psi(b_0^{(2)}, b^{(2)}) = 1 \times P_{u_+}$
and $\psi(b_0^{(2)}, b_0^{(2)} \setminus b^{(2)}) = 1 \times P_{u_-}$ (as
operators in $\mathcal{H}_3$).

Now we want to construct a new preparing apparatus composed of the
apparatus $a$ and the apparatus $b_0^{(2)}$. We get three new preparing
procedures, preparing particles of sort 1.

We do this in the following way (see Figure 1): the pairs prepared by $a$
leave the preparing apparatus in such a manner that the particles of sort 1 and
2 leave the apparatus in opposite directions. Particles of sort 2 enter the
recording apparatus $b_0^{(2)}$. The particles of sort 1 leave the new apparatus
composed of $a$ and $b_0^{(2)}$. The apparatus composed of $a$ and $b_0^{(2)}$
gives us three preparing procedures: the preparing procedure $a_1^3$ comprises
all particles of sort 1 leaving $a$. The preparing procedure $a_1^{3+}$
comprises all particles of sort 1 such that $b^{(2)}$ has given a response. The
preparing procedure $a_1^{3-}$ comprises all particles of sort 1 such that
$b_0^{(2)} \setminus b^{(2)}$ has given a response.

Figure 1 Preparing apparatus composed of apparatus $a$ and $b_0^{(2)}$

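Apparently, the following relations hold:

$a_1^3 = a_1^{3+} \cup a_1^{3-}$, $a_1^{3+} \cap a_1^{3-} = \emptyset$,

i.e. $a_1^3 = a_1^{3+} \cup a_1^{3-}$ is a demixture of $a_1^3$.
$\varphi(a_1^3)$, $\varphi(a_1^{3+})$, $\varphi(a_1^{3-})$ are operators in
$\mathcal{H}_1$:

$\varphi(a_1^3) = W = \frac{1}{2} 1$, $\varphi(a_1^{3+}) = P_{u_-}$, $\varphi(a_1^{3-}) = P_{u_+}$

Since

$\lambda_{\mathcal{Q}}(a_1^3, a_1^{3+}) = \lambda_{\mathcal{Q}}(a_1^3, a_1^{3-}) = \frac{1}{2}$

the demixture $a_1^3 = a_1^{3+} \cup a_1^{3-}$ leads (according to Section 6(A))
to the demixture of the ensemble $W$:
$W = \varphi(a_1^3) = \frac{1}{2}\varphi(a_1^{3+}) + \frac{1}{2}\varphi(a_1^{3-})$, i.e.

$W = \frac{1}{2} P_{u_-} + \frac{1}{2} P_{u_+}$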
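The demixture above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the singlet vector $\chi$ is written with real coefficients in the product basis of the 3-component eigenvectors, and the helper names (`conditional_state`, `projector`) are hypothetical.

```python
from math import sqrt

# Singlet coefficients c[i][j]: i labels particle 1, j labels particle 2,
# with index 0 = u_plus and index 1 = u_minus, so that
# chi = (u+(1) u-(2) - u-(1) u+(2)) / sqrt(2).
chi = [[0.0, 1 / sqrt(2)],
       [-1 / sqrt(2), 0.0]]

def conditional_state(c, j):
    """Particle-1 vector and probability, given particle 2 is found in basis state j."""
    unnormalized = [c[0][j], c[1][j]]
    prob = sum(x * x for x in unnormalized)  # coefficients are real here
    return [x / sqrt(prob) for x in unnormalized], prob

def projector(v):
    """|v><v| for a real two-component vector."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

# Response of b^(2): particle 2 found in u_plus; complementary response: u_minus.
state_plus, weight_plus = conditional_state(chi, 0)
state_minus, weight_minus = conditional_state(chi, 1)

P_plus_branch = projector(state_plus)    # plays the role of phi(a_1^{3+})
P_minus_branch = projector(state_minus)  # plays the role of phi(a_1^{3-})

# Weighted sum of the two branches, to be compared with W = (1/2) 1.
W = [[weight_plus * P_plus_branch[i][j] + weight_minus * P_minus_branch[i][j]
      for j in range(2)] for i in range(2)]
print(weight_plus, weight_minus, W)
```

Under these assumptions the branch selected by the response of $b^{(2)}$ is the projector on $u_-$, both weights are one half, and the weighted sum is half the unit operator.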

It is not difficult to construct many more preparing procedures in a similar
way. We combine the preparing apparatus $a$ with a recording apparatus
that measures the 1-component of the spin of particles of sort 2. Then we get
three preparing procedures $a_1^1$, $a_1^{1+}$, $a_1^{1-}$ such that
$\varphi(a_1^{1+}) = P_{v_-}$, $\varphi(a_1^{1-}) = P_{v_+}$,
$v_+$ and $v_-$ being the eigenvectors of the 1-component of the spin.
$a_1^1 = a_1^{1+} \cup a_1^{1-}$ is a demixture of $a_1^1$, which implies

$\varphi(a_1^1) = W = \frac{1}{2} 1 = \frac{1}{2} P_{v_-} + \frac{1}{2} P_{v_+}$

Since $\varphi(a_1^3) = \varphi(a_1^1)$, we have obtained two different
demixtures of the same $W$. Since the two preparing procedures, $a_1^3$ and
$a_1^1$, differ only in the measuring parts, one might think that we should
postulate $a_1^3 = a_1^1$. However, although the postulate $a_1^3 = a_1^1$
seems very 'plausible' at first sight, it leads to a contradiction. Indeed, we
will prove $a_1^3 \cap a_1^1 = \emptyset$, which contradicts the postulate
$a_1^3 = a_1^1$, since $a_1^3, a_1^1 \in \mathcal{Q}'$, i.e.
$a_1^3 \neq \emptyset$, $a_1^1 \neq \emptyset$. Hence a paradox (i.e. a
contradiction) arises if one attempts to add to the theory the (perhaps
plausible) axiom $a_1^3 = a_1^1$.
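That the same ensemble admits two different demixtures is easy to confirm for spin 1/2. The sketch below is illustrative only; it assumes the standard eigenvectors of the 3- and 1-components of the spin in a fixed basis, and all names are hypothetical.

```python
from math import sqrt

def projector(vec):
    """Return |vec><vec| for a normalized two-component vector."""
    return [[vec[i] * complex(vec[j]).conjugate() for j in range(2)] for i in range(2)]

def mix(p, q):
    """Equal-weight mixture (1/2) p + (1/2) q."""
    return [[0.5 * p[i][j] + 0.5 * q[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Eigenvectors of the 3-component (u+, u-) and of the 1-component (v+, v-).
u_plus, u_minus = [1, 0], [0, 1]
v_plus = [1 / sqrt(2), 1 / sqrt(2)]
v_minus = [1 / sqrt(2), -1 / sqrt(2)]

W_from_3 = mix(projector(u_minus), projector(u_plus))
W_from_1 = mix(projector(v_minus), projector(v_plus))
half_identity = [[0.5, 0], [0, 0.5]]

# Both demixtures give the same W, although their components differ.
same_ensemble = close(W_from_3, half_identity) and close(W_from_1, half_identity)
distinct_components = not close(projector(u_minus), projector(v_minus))
print(same_ensemble, distinct_components)
```

The check only concerns the statistical operators; the statement that the preparing procedures themselves are disjoint is a property of the structure of preparing procedures and is not captured by this calculation.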

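To prove $a_1^3 \cap a_1^1 = \emptyset$, we start with the relation

$a_1^3 \cap a_1^1 = (a_1^{3+} \cap a_1^{1+}) \cup (a_1^{3+} \cap a_1^{1-}) \cup (a_1^{3-} \cap a_1^{1+}) \cup (a_1^{3-} \cap a_1^{1-})$

Since $\mathcal{Q}$ is a selection structure, $a_1^{3+} \cap a_1^{1+} \in \mathcal{Q}$
holds. If $a_1^{3+} \cap a_1^{1+} \in \mathcal{Q}'$ (i.e.
$a_1^{3+} \cap a_1^{1+} \neq \emptyset$) were true,
$\varphi(a_1^{3+} \cap a_1^{1+})$ would be an ensemble.
The two demixtures

$a_1^{3+} = (a_1^{3+} \cap a_1^{1+}) \cup \tilde{a}_1^{3+}$ and $a_1^{1+} = (a_1^{3+} \cap a_1^{1+}) \cup \tilde{a}_1^{1+}$

with

$\tilde{a}_1^{3+} = a_1^{3+} \setminus (a_1^{3+} \cap a_1^{1+})$,

$\tilde{a}_1^{1+} = a_1^{1+} \setminus (a_1^{3+} \cap a_1^{1+})$

imply the demixtures

$\varphi(a_1^{3+}) = \lambda\, \varphi(a_1^{3+} \cap a_1^{1+}) + (1 - \lambda)\, \varphi(\tilde{a}_1^{3+})$

$\varphi(a_1^{1+}) = \mu\, \varphi(a_1^{3+} \cap a_1^{1+}) + (1 - \mu)\, \varphi(\tilde{a}_1^{1+})$

Since $\varphi(a_1^{3+}) = P_{u_-}$ and $\varphi(a_1^{1+}) = P_{v_-}$, the
demixtures take on the form

$P_{u_-} = \lambda W' + (1 - \lambda) W''$ and $P_{v_-} = \mu W' + (1 - \mu) W'''$

with $W' = \varphi(a_1^{3+} \cap a_1^{1+})$.
$a_1^{3+} \cap a_1^{1+} \neq \emptyset$ would imply $\lambda \neq 0$ and
$\mu \neq 0$. Since $P_{u_-}$, $P_{v_-}$ are extreme points, $\lambda \neq 0$
and $\mu \neq 0$ would have the consequence $W' = P_{u_-}$ and $W' = P_{v_-}$,
which is a contradiction. Thus $a_1^{3+} \cap a_1^{1+} = \emptyset$ is proved.

In a similar way, $a_1^{3+} \cap a_1^{1-} = \emptyset$, etc., can be proved.
Thus also $a_1^3 \cap a_1^1 = \emptyset$ is proved.

It is a consequence of $a_1^3 \cap a_1^1 = \emptyset$ that the two demixtures
$W = \frac{1}{2} P_{u_-} + \frac{1}{2} P_{u_+} = \frac{1}{2} P_{v_-} + \frac{1}{2} P_{v_+}$
of $W = \frac{1}{2} 1$ are not coexistent.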

The example, $\varphi(a_1^3) = \varphi(a_1^1)$, shows an essential feature of
quantum mechanics, namely, the fact that something of the structure of the
preparing procedures gets lost by the mapping $\varphi$ of $\mathcal{Q}'$ on to
$K$. The introduction of the structure $\mathcal{Q}$, $\mathcal{R}_0$,
$\mathcal{R}$ as a basis for the interpretation of quantum mechanics is
essential to clear up all problems of the interpretation.

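The example above is an 'extreme' case of two non-coexistent preparing
procedures. We define:

Definition 20: Two demixtures of an ensemble $w \in K$

$w = \sum_i \lambda_i w_i = \sum_k \tilde{\lambda}_k \tilde{w}_k$

($w_i, \tilde{w}_k \in K$) are called complementary if the assumption
$\varphi(a) = \varphi(\tilde{a}) = w$ together with two demixtures
$a = \bigcup_i a_i$, $\tilde{a} = \bigcup_k \tilde{a}_k$, such that
$\varphi(a_i) = w_i$, $\varphi(\tilde{a}_k) = \tilde{w}_k$, implies
$a \cap \tilde{a} = \emptyset$.
Example: the two demixtures

$W = \frac{1}{2} 1 = \frac{1}{2} P_{u_-} + \frac{1}{2} P_{u_+} = \frac{1}{2} P_{v_-} + \frac{1}{2} P_{v_+}$

are complementary.

Definition 20 may be generalized in such a form that only abstract Boolean
algebras (instead of $\mathcal{Q}(a)$, $\mathcal{Q}(\tilde{a})$) are used (see
Ludwig, in preparation b).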

Another possibility of formulating the intuitive idea that $a_1^3$ and $a_1^1$
should not 'essentially' differ would be the following:

It is true that not every preparing procedure $a \in \mathcal{Q}'$ can be
demixed 'arbitrarily' by constructing a corresponding apparatus. However, one
might imagine that, ideally, there should exist many more and finer 'preparing
procedures' than those of $\mathcal{Q}'$. For instance, it is conceivable that
procedure $a$ could be demixed in such a way that the spin-3-components of the
components of the demixture are well-determined.

According to this point of view, we shall now attempt to add new axioms to
the theory: let $\tilde{\mathcal{Q}}$ be the set of all 'imagined preparing
procedures'. We postulate $\tilde{\mathcal{Q}} \supset \mathcal{Q}$ and all the
axioms APS 1 through APS 7 also for the set $\tilde{\mathcal{Q}}$. Naturally,
it would make no sense to postulate the other axioms AV 1 through AV 4 also for
$\tilde{\mathcal{Q}}$, since we assume that there are possibly more imagined
preparing procedures than those of $\mathcal{Q}$. In the same way as
$\mathcal{Q}$, $\tilde{\mathcal{Q}}$ also splits into equivalence classes.
Since the effect procedures $f \in \mathcal{F}$ are 'tested' with the help of a
larger set of preparing procedures, namely, those of $\tilde{\mathcal{Q}}$, the
partition of $\mathcal{F}$ into equivalence classes with respect to
$\tilde{\mathcal{Q}}$ will, in general, be finer than that of $\mathcal{F}$
with respect to $\mathcal{Q}$.

According to the idea that the elements of $G$ are symbols for some properties
which the microsystems are endowed with, we want to express that an
$f \in \mathcal{F}$, such that $\psi(f) \in G$, characterizes only the fact that
the microsystem has been found to 'have' the property $\psi(f)$. Therefore, we
postulate the axiom:

$\mu(a, f_1) = 1$, $a \in \tilde{\mathcal{Q}}'$
(not only $a \in \mathcal{Q}'$), and

$\psi(f_2) = \psi(f_1) \in G$ implies $\mu(a, f_2) = 1$

In this axiom, we express the idea that the difference between the two effect
procedures, $f_1$, $f_2$, is not essential if the microsystems recorded by these
procedures 'have' the property $\psi(f_1) = \psi(f_2)$. If an 'imagined
preparing procedure' involves only systems endowed with the property
$e = \psi(f_1) = \psi(f_2)$, then the probability must be one for $f_1$ as well
as for $f_2$.

All axioms introduced for the set $\tilde{\mathcal{Q}}$ are satisfied if one
puts $\tilde{\mathcal{Q}} = \mathcal{Q}$. They cannot lead to any
contradiction. The essential difference between $\tilde{\mathcal{Q}}$ and
$\mathcal{Q}$ shall be a much wider possibility to demix preparing procedures.
According to the experiments of the EPR paradox the following axiom seems to be
very plausible:

To every demixture of a recording method, $b_0 \in \mathcal{R}_0$, in the form

$b_0 = b_1 \cup b_2 \cup b_3$

(i.e. $b_i \in \mathcal{R}$ and $b_i \cap b_k = \emptyset$ if $i \neq k$), such that

$\psi(b_0, b_1) = e_1 \in G$, $\psi(b_0, b_2) = e_2 \in G$, $\psi(b_0, b_3) = e_3 \in G$

hold, there exists, to every $a \in \tilde{\mathcal{Q}}'$ (not necessarily
$a \in \mathcal{Q}'$), a demixture
$a = \tilde{a}_1 \cup \tilde{a}_2 \cup \tilde{a}_3$ such that
$\tilde{a}_i \in \tilde{\mathcal{Q}}$ and $\mu(\tilde{a}_i, (b_0, b_i)) = 1$ (if
$\tilde{a}_i \neq \emptyset$).

This 'demixing' axiom means the following: every set of microsystems
prepared in a 'normal' preparing procedure $a$ may be thought to be demixed
with respect to every triplet of decision effects $e_1$, $e_2$, $e_3$,
satisfying $e_1 + e_2 + e_3 = 1$, in such a manner that the microsystems of the
components $\tilde{a}_i$ 'have' the same property $e_i$.

The axioms introduced above allow us to define a mapping $\Phi$ of $G$ into
$\mathcal{P}(M)$

$\Phi(e) = \bigcup_{a \in \tilde{\mathcal{Q}}_1(e)} a$

where $\tilde{\mathcal{Q}}_1(e) = \{a \mid a \in \tilde{\mathcal{Q}}'$ and
$\mu(a, f) = 1$ if $\psi(f) = e\}$. It may be proved (see Ludwig, 1975):

(1). $e_1 \le e_2$ implies $\Phi(e_1) \subset \Phi(e_2)$;
(2). $\Phi(e) \cap \Phi(e^\perp) = \emptyset$;
(3). $e_i \in G$ and $e_1 + e_2 + e_3 = 1$ implies

$\Phi(e_1) \cup \Phi(e_2) \cup \Phi(e_3) = M$

These relations lead to a contradiction (as can be proved by the methods
given in Bell, 1966; see also Ludwig, 1975). Therefore, the idea of a set
$\tilde{\mathcal{Q}}$ of 'imagined preparing procedures', such that the
'demixing axiom' holds, is forbidden, even though this idea might seem very
plausible.


We have presented some examples to demonstrate the following: 

(1). How one may work with the theory founded in Sections 1 and 2.

(2). How to make precise theoretical propositions concerning individual
microsystems.

(3). How to formulate additional 'imagined' structures by axioms, with the
intention to test these additional structures on their compatibility with the
theory.
We have shown that there is no difference between classical systems and 
microsystems as to the formulation of propositions concerning individual 
systems. The reason for the difference between classical systems and microsys- 
tems is only a different form of the statistical laws, due to axiom AV 4. 


Thanks are due to Professor Jenc, Dr. Kanthack, Professor Melsheimer and 
Professor Neumann for critical reading of the manuscript and numerous 
improvements in the text. 


Bell, J. S. (1966) 'On the problem of hidden variables in quantum mechanics', Rev. Mod. Phys., 38, 447-452.
Ludwig, G. (1964) 'Versuch einer axiomatischen Grundlegung der Quantenmechanik und allgemeinerer physikalischer Theorien', Z. Physik, 181, 233-260.
Ludwig, G. (1967a) 'An axiomatic foundation of quantum mechanics on a nonsubjective basis', in Quantum Theory and Reality, Springer, Berlin, pp. 98-104.
Ludwig, G. (1967b) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories II', Commun. Math. Phys., 4, 331-348.
Ludwig, G. (1967c) 'Hauptsätze des Messens als Grundlage der Hilbertraumstruktur der Quantenmechanik', Z. Naturforsch., 22a, 1303-1323.
Ludwig, G. (1967d) 'Ein weiterer Hauptsatz des Messens als Grundlage der Hilbertraumstruktur der Quantenmechanik', Z. Naturforsch., 22a, 1324-1327.
Ludwig, G. (1968) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories III', Commun. Math. Phys., 9, 1-12.
Ludwig, G. (1970) 'Deutung des Begriffs "Physikalische Theorie" und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens', Lecture Notes in Physics, 4, Springer, Berlin.
Ludwig, G. (1971a) 'The measuring process and an axiomatic foundation of quantum mechanics', Foundation of Quantum Mechanics, B. D'Espagnat ed., Academic Press, New York, pp.
Ludwig, G. (1971b) 'A physical interpretation of an axiom within an axiomatic approach to quantum mechanics and a new formulation of this axiom as a general covering condition', Notes in Math. Phys., 1, Marburg.
Ludwig, G. (1972a) 'Meß- und Präparierprozesse', Notes in Math. Phys., 6, Marburg.
Ludwig, G. (1972b) 'An improved formulation of some theorems and axioms in the axiomatic foundation of the Hilbert space structure of quantum mechanics', Commun. Math. Phys., 26,
Ludwig, G. (1974a) 'Measuring and preparing processes', Lecture Notes in Physics, 29, Springer, Berlin, pp. 122-162.
Ludwig, G. (1974b) Einführung in die Grundlagen der Theoretischen Physik, Vol. 1, Bertelsmann,
Ludwig, G. (1975) Einführung in die Grundlagen der Theoretischen Physik, Vol. 3, Bertelsmann-Vieweg, Düsseldorf-Wiesbaden.
Ludwig, G. (in preparation a) Fundaments of Quantum Mechanics, Springer, Berlin, 2nd ed. of Die Grundlagen der Quantenmechanik (1954).
Ludwig, G. (in preparation b) 2nd ed. of Ludwig (1970).
Nagel, R. J. (1974) 'Order unit and base norm spaces', Lecture Notes in Physics, Vol. 29, Springer, Berlin, pp. 23-29.
Neumann, H. (1971) 'Classical systems and observables in quantum mechanics', Commun. Math. Phys., 23, 100-116.
Neumann, H. (1972) 'Classical systems in quantum mechanics and their representation in topological spaces', Notes in Math. Phys., 10, Marburg.
Neumann, H. (1974a) 'On the representation of classical systems', Lecture Notes in Physics, Vol. 29, Springer, Berlin, pp. 316-321.
Neumann, H. (1974b) 'A new physical characterization of classical systems in quantum mechanics', Int. J. Theoret. Phys., 9, 225-228.
v. Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton.
Stolz, P. (1971) 'Attempt of an axiomatic approach of quantum mechanics and more general theories IV', Commun. Math. Phys., 23, 117-126.

Quantum Mechanics of Bounded Operators 


Institute of Mathematical Sciences, Madras, India 


Heisenberg (1925) started the golden age of modern quantum mechanics. The 
essence of his discovery has been in the identification of physical observables in 
terms of Hermitian matrices (operators) leading to the fact that operators 
corresponding to canonical variables do not commute. The 'uncertainty' in the 
simultaneous measurement of canonical variables is then a simple manifesta- 
tion of this non-commuting behaviour of the corresponding Hermitian 
operators. Weyl (1931) rewrote the Heisenberg commutation relation in an 
exponential form leading to a certain nilpotent Lie group. Von Neumann 
(1931) and Stone (1930) simultaneously solved the problem of uniqueness for 
Weyl commutation relations. The Weyl group has a single irreducible
representation which is precisely the Schrödinger representation. The complete
equivalence of the Heisenberg and the Weyl commutation relations has been
established (Rellich, 1946; Dixmier, 1958; Nelson, 1959; Cartier, 1966). The
canonical commutation relation (CCR) of Heisenberg implies that the Hilbert 
space on which these operators act is infinite dimensional and that one (or both) 
of these operators should be necessarily unbounded. A natural question arises 
whether one can write an analogue of CCR for operators with a discrete 
bounded spectrum and acting on a finite dimensional vector space. The answer 
is yes. To achieve this, we start with the Weyl form (Weyl, 1931) of the 
representations of the Abelian group of unitary rotations in ray space. We 
define the canonical generators (Hermitian) as the logarithms of the finite 
dimensional unitary rotations. We then compute explicitly the commutator of 
these generators. It is naturally trace free. Also, we demonstrate that in the 
limit of continuous spectrum, valid as the dimension goes to infinity, the new 
commutator reduces to the standard CCR. Thus, our relation is the correct 
discrete analogue of CCR and it is unique if one starts with the Weyl form. We 
elevate the commutator for bounded operators with discrete spectrum to what 
we call quantum mechanics on discrete space (QMDS) (Santhanam and 
Tekumalla, 1975). 

We show that the angular momentum operators satisfy QMDS with the 
corresponding phases. To demonstrate this, we reformulate the theory of 



angular momentum by quantising the azimuthal direction instead of the 
zenithal angle (Levy-Leblond, 1973). 

It turns out that there is no 'uncertainty' in the measurement of canonical 
variables (canonical in the sense of QMDS). The uncertainty in the measure- 
ment of usual canonical variables (CCR), in our formulation, is a manifestation 
of the continuous nature of the spectrum. QMDS is, however, distinct from the 
classical theory since the operators do not commute. 

There is an approach due to Schwinger (1960) to discuss quantum mechanics 
in finite dimensions and eventually take suitable limits to get the usual one. We 
will make use of this technique. 

Besides, the representation theory of generalized Clifford algebras has been 
studied by Ramakrishnan and coworkers (1969a, b). We shall make use of 
some of their results. 


Suppose A and B are two elements of the Abelian group of unitary rotations in
ray space so that

$AB = \varepsilon BA$ (1)

where $\varepsilon$ is the primitive nth root of unity. By iteration we have

$A^k B^l = \varepsilon^{kl} B^l A^k, \quad k, l = 0, 1, \ldots, n-1$ (2)

From this equation it follows that $A^n$ commutes with B and $B^n$ commutes with
A and if the Abelian rotation group is irreducible it follows from Schur's
lemma that

$A^n = I, \quad B^n = I$ (3)

where I is the (n x n) unit matrix. The order of any element of an irreducible
Abelian rotation group in n dimensions is consequently a factor of n. In the
diagonal representation for B, i.e.

$B = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1})$ (4)

the matrix A has the form of a cyclic permutation matrix

$A = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & \cdots & 0
\end{pmatrix}$ (5)

Santhanam 229 

The action of A and B on the components of an n-dimensional vector is then

$A: x'_k = x_{k+1}$

$B: x'_k = \varepsilon^k x_k$

More generally,

$A^s: x'_k = x_{k+s}$

$B^t: x'_k = \varepsilon^{kt} x_k$ (7)

s, t = any integer
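Relations (1) to (3) can be checked directly for a small dimension. The following is a minimal sketch using only the standard library; the choice n = 5 and the helper names are illustrative assumptions, and the matrices are built from the component actions stated above, with indices taken modulo n.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(a, p):
    n = len(a)
    out = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        out = matmul(out, a)
    return out

def scaled(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

n = 5
eps = cmath.exp(2j * cmath.pi / n)  # primitive nth root of unity
# A shifts components, (A x)_k = x_{k+1}; B multiplies them, (B x)_k = eps^k x_k.
A = [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]
B = [[eps ** i if i == j else 0 for j in range(n)] for i in range(n)]
identity = matpow(A, 0)

weyl_ok = close(matmul(A, B), scaled(eps, matmul(B, A)))            # relation (1)
k, l = 2, 3
iterated_ok = close(matmul(matpow(A, k), matpow(B, l)),
                    scaled(eps ** (k * l), matmul(matpow(B, l), matpow(A, k))))  # relation (2)
order_ok = close(matpow(A, n), identity) and close(matpow(B, n), identity)      # relation (3)
print(weyl_ok, iterated_ok, order_ok)
```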

The transition to continuous groups is now carried out. Following Weyl (1931)
we set

$A = e^{i\xi P}$

$B = e^{i\eta Q}$

where $\xi$ and $\eta$ are real infinitesimal parameters and we pass to the
limit $n \to \infty$. $\varepsilon$ has therefore to be identified as

$\varepsilon = e^{2\pi i/n} = e^{i\xi\eta}$

which yields

$n\xi\eta = 2\pi$

We see that

$A^s = e^{is\xi P} = e^{i\sigma P}, \quad \sigma = \xi s$

$B^t = e^{it\eta Q} = e^{i\tau Q}, \quad \tau = \eta t$

$\varepsilon^{kt} = e^{i\xi\eta kt} = e^{i\xi k\tau}$

The eigenvalues of Q are given by

$q = \xi k \bmod n\xi$

where k runs through all integral values. As $n\xi = 2\pi/\eta$, by choosing
$\eta$ infinitesimal, we see that q may assume all real numbers from
$-\infty$ to $+\infty$. In this limit we set

$x_k = \sqrt{\xi}\, \psi(\xi k), \quad \xi k = q$ (14)

where $\psi(q)$ is an arbitrary function satisfying the normalization

$\int |\psi(q)|^2\, dq = 1$

We find then that the quantity $e^{i\tau Q}$ is represented by the linear operator

$\psi(q) \to e^{i\tau q}\, \psi(q)$ (16)

and the operator representing $e^{i\sigma P}$ by

$\psi(q) \to \psi(q + \sigma)$ (17)

From equation (17) it is clear that $e^{i\sigma P}$ acts as the translation
operator. If these linear operators are infinitesimal we have

$q: \delta\psi(q) = q\, \psi(q)$

$p: \delta\psi(q) = \frac{1}{i} \frac{d\psi}{dq}$

Thus, it follows that the Schrödinger representation (wave equation) is a
necessary consequence of Weyl's commutation relation. To summarize we
have the theorem (of von Neumann) as follows.

Theorem. Let $U(\xi) = e^{i\xi P}$ and $V(\eta) = e^{i\eta Q}$ be one-parameter
continuous unitary groups on a separable Hilbert space $\mathcal{H}$ satisfying
the Weyl relation

$U(\xi)\, V(\eta) = e^{i\xi\eta}\, V(\eta)\, U(\xi)$ (19)

Then there are closed subspaces $\mathcal{H}_l$ so that

(1). $\mathcal{H} = \bigoplus_{l=1}^{N} \mathcal{H}_l$ (N is a positive integer or $\infty$)

(2). $U(\xi): \mathcal{H}_l \to \mathcal{H}_l$

$V(\eta): \mathcal{H}_l \to \mathcal{H}_l$ for all $\xi, \eta \in R$

(3). For each l, there is a unitary operator $T_l: \mathcal{H}_l \to L^2(R)$
such that $T_l U(\xi) T_l^{-1}$ is translation to the left by $\xi$ and
$T_l V(\eta) T_l^{-1}$ is multiplication by $e^{i\eta x}$.
It also follows from the theorem that if P and Q denote the generators of
$U(\xi)$ and $V(\eta)$, respectively, then there is a dense domain
$D \subset \mathcal{H}$ so that

(a) $P: D \to D$, $Q: D \to D$

(b) $[Q, P]\phi = i\phi$, for all $\phi \in D$.

(c) P and Q are essentially self-adjoint on D.

Thus the Schrödinger representation is the only representation of CCR. It is
not difficult to see that a pair P, Q of self-adjoint operators satisfying the
canonical commutation relation $[Q, P] = iI$ cannot both be bounded. If they
were bounded then

$PQ^n - Q^nP = -inQ^{n-1}$

and thus

$n\, \|Q^{n-1}\| \le 2\, \|P\|\, \|Q^n\| \le 2\, \|P\|\, \|Q\|\, \|Q^{n-1}\|$

and hence

$\|P\|\, \|Q\| \ge n/2$

for all n, which is a contradiction. Therefore, either P or Q or both must be
unbounded. This can also be seen by simply taking the trace on both sides of
the commutation relation.

Mackey (1949) replaced $V(\eta) = e^{i\eta Q}$ by its spectral measure E such that

$V(\eta) = \int e^{i\eta x}\, dE(x)$

The measure E and $U(\xi)$ form an imprimitivity system for R based on R and the
uniqueness theorem is then a consequence of the imprimitivity theorem.
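The trace argument mentioned above can be written out in one line. This is a sketch under the assumption that Q and P act on a finite, n-dimensional space, where the trace is well defined and cyclic:

```latex
\operatorname{Tr}\,[Q, P] = \operatorname{Tr}(QP) - \operatorname{Tr}(PQ) = 0,
\qquad \text{whereas} \qquad
\operatorname{Tr}(iI) = i\,n \neq 0 .
```

Hence $[Q, P] = iI$ has no solution in finite dimensions, and any finite-dimensional analogue of the commutator must be trace-free.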


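In this, we essentially follow the method of Schwinger (1960) in the
construction of a unitary operator A given by the cyclic permutation matrix. The
action of A in Dirac's notation is

$\langle a^k | A = \langle a^{k+1} |, \quad k = 1, 2, \ldots, n$

where we identify

$\langle a^{n+1} | = \langle a^1 |$

The eigenvalues of A are then the n roots of unity

$\varepsilon^k, \quad k = 0, 1, 2, \ldots, n-1$

$\varepsilon = e^{2\pi i/n}$

The Sylvester matrix defined as

$S = \frac{1}{\sqrt{n}} \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \varepsilon & \varepsilon^2 & \cdots & \varepsilon^{n-1} \\
1 & \varepsilon^2 & \varepsilon^4 & \cdots & \varepsilon^{2(n-1)} \\
\vdots & & & & \vdots \\
1 & \varepsilon^{n-1} & \varepsilon^{2(n-1)} & \cdots & \varepsilon^{(n-1)^2}
\end{pmatrix}$

$S S^\dagger = S^\dagger S = I$

diagonalizes any circulant matrix and in particular the cyclic permutation
matrix A. Hence

$S^{-1} A S = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1})$ (26)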

Thus any two operators connected by the transform equation (25) have as their
eigenvalues the n roots of unity and if we denote their eigenvectors (of A and
B) by $|\varepsilon^k\rangle$ and $|\varepsilon^l\rangle$, respectively, then

$\langle \varepsilon^l | \varepsilon^k \rangle = S_{kl} = \frac{1}{\sqrt{n}}\, \varepsilon^{kl}$
In fact, we have the Theorem of Schwinger. 

Theorem. The basis of any finite dimensional vector space can be mapped by 
a unitary transformation (Sylvester transform) to a basis furnished by the roots 
of unity. 
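The diagonalizing property of the Sylvester matrix can be illustrated numerically. The sketch below is an assumption-laden example, not the author's procedure: it takes n = 4, builds $S_{kl} = \varepsilon^{kl}/\sqrt{n}$ with indices running from 0 to n-1, and uses the cyclic permutation matrix with ones on the superdiagonal and in the lower left corner.

```python
import cmath
from math import sqrt

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

n = 4
eps = cmath.exp(2j * cmath.pi / n)
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Sylvester matrix S_kl = eps^(k l) / sqrt(n), with k, l = 0, ..., n-1.
S = [[eps ** (k * l) / sqrt(n) for l in range(n)] for k in range(n)]
# Cyclic permutation matrix: (A x)_k = x_{k+1}, indices modulo n.
A = [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]

unitary_ok = close(matmul(S, dagger(S)), identity)
# S is unitary, so S^{-1} = S^dagger.
transformed = matmul(dagger(S), matmul(A, S))
expected_diag = [[eps ** k if k == l else 0 for l in range(n)] for k in range(n)]
diagonal_ok = close(transformed, expected_diag)
print(unitary_ok, diagonal_ok)
```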

Suppose we have two unitary operators U and V satisfying

$VU = \varepsilon UV$

$\varepsilon^n = 1$ (28)

$U^n = V^n = I$

Then the $n^2$ operators defined by

$X_{kl} = \frac{1}{\sqrt{n}}\, U^k V^l, \quad k, l = 0, 1, 2, \ldots, n-1$

are linearly independent. All $X_{kl}$ except the unit operator are traceless
and with the multiplication defined by equation (28) form an associative
algebra $C_n^2$. We shall later discuss the representations of the generalized
Clifford algebra $C_n^m$. Suffice it now to say that $C_n^2$ is isomorphic to
the matrix ring $M_{n \times n}$. Thus, the operators defined in equation (29)
furnish a unitary operator basis and this fact has been particularly used by
Schwinger (1960). Now consider an arbitrary Y. We notice that

$\sum_{k,l} X_{kl}\, Y X_{kl}^\dagger = \frac{1}{n} \sum_{k,l} U^k V^l\, Y\, V^{-l} U^{-k}$

$= \frac{1}{n} \sum_{k,l} V^k U^l\, Y\, U^{-l} V^{-k}$

It is easy to show that this operator commutes with U and V. Hence

$\sum_{k,l} X_{kl}\, Y X_{kl}^\dagger = \tau I$

$\tau = \mathrm{Tr}\, Y$

We refer to U and V as a complementary pair of operators. From equation (31)
it follows that

$U^k V^l\, U\, V^{-l} U^{-k} = \varepsilon^l U$

$U^k V^l\, V\, V^{-l} U^{-k} = \varepsilon^{-k} V$

which exhibits the unitary transformations that produce only cyclic spectral
translations. If Y is an arbitrary function F of U and V we see from equation (31)

$\frac{1}{n^2} \sum_{k,l} F(\varepsilon^l U, \varepsilon^{-k} V) = \frac{1}{n}\, (\mathrm{Tr}\, F)\, I$

which is a kind of ergodic theorem.
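The averaging identity over the operator basis can be verified for a small case. This is an illustrative sketch only: n = 3, the matrix Y holds arbitrary made-up entries, and U and V are taken in the representation where U is diagonal and V is the cyclic permutation matrix, which satisfies $VU = \varepsilon UV$.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def matpow(a, p):
    n = len(a)
    out = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        out = matmul(out, a)
    return out

n = 3
eps = cmath.exp(2j * cmath.pi / n)
U = [[eps ** i if i == j else 0 for j in range(n)] for i in range(n)]
V = [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]

# Arbitrary test operator Y (hypothetical values, chosen only for illustration).
Y = [[1 + 2j, 0.5, -1j],
     [3, -2 + 1j, 0.25],
     [1j, 4, 0.5 - 0.5j]]

# Accumulate sum_{k,l} X_kl Y X_kl^dagger with X_kl = U^k V^l / sqrt(n).
total = [[0j] * n for _ in range(n)]
for k in range(n):
    for l in range(n):
        W = matmul(matpow(U, k), matpow(V, l))
        term = matmul(W, matmul(Y, dagger(W)))
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j] / n

trace_Y = sum(Y[i][i] for i in range(n))
average_ok = all(abs(total[i][j] - (trace_Y if i == j else 0)) < 1e-9
                 for i in range(n) for j in range(n))
print(average_ok)
```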

The operators $V^l$ and $U^k$, $l, k = 0, 1, 2, \ldots, n-1$, will satisfy the
same operator relation as V and U, viz.

$V^l U^k = e^{2\pi i/n}\, U^k V^l$

provided

$kl = 1 \bmod n$

with the unique solution given by the Fermat-Euler theorem

$l = k^{\phi(n)-1} \bmod n$

where $\phi(n)$ = the number of integers less than and relatively prime to n
(a solution exists only if k is relatively prime to n). The pair of operators
$U^k$, $V^l$ are also complementary. Suppose now we write

$n = n_1 n_2$ (37)

where the integers are relatively prime.
Then we can rewrite

$e^{2\pi i k/n} = e^{2\pi i k_1/n_1}\, e^{2\pi i k_2/n_2}$

where

$k = k_1 n_2 + k_2 n_1$
$k_1 = 0, 1, 2, \ldots, n_1 - 1$
$k_2 = 0, 1, 2, \ldots, n_2 - 1$

Thus a single basis defined by $\varepsilon^k$ can be written as

$|\varepsilon^k\rangle = |\varepsilon_1^{k_1}\rangle\, |\varepsilon_2^{k_2}\rangle$

where $\varepsilon_1$ and $\varepsilon_2$ are of periods $n_1$ and $n_2$,
respectively. Then the single pair of complementary operators can be replaced
by two pairs satisfying

$V_1 U_1 = \varepsilon_1 U_1 V_1$

$V_2 U_2 = \varepsilon_2 U_2 V_2$

where

$U_1 = U^{n_2}, \quad V_1 = V^{l_1 n_2}$
$U_2 = U^{n_1}, \quad V_2 = V^{l_2 n_1}$

and

$l_1 = n_2^{\phi(n_1)-1} \bmod n_1$

$l_2 = n_1^{\phi(n_2)-1} \bmod n_2$
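The two pairs commute with each other. The basis now becomes

$X_{kl} \to X_{k_1 l_1, k_2 l_2} = \frac{1}{\sqrt{n}}\, U_1^{k_1} U_2^{k_2} V_1^{l_1} V_2^{l_2}$

$k_1, l_1 = 0, 1, 2, \ldots, n_1 - 1$
$k_2, l_2 = 0, 1, 2, \ldots, n_2 - 1$

i.e.

$X_{k_1 l_1, k_2 l_2} = \prod_{j=1}^{2} X_{k_j l_j}$

$X_{k_j l_j} = \frac{1}{\sqrt{n_j}}\, U_j^{k_j} V_j^{l_j}$

$k_j, l_j = 0, 1, 2, \ldots, n_j - 1$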
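The number-theoretic steps used above (the Fermat-Euler solution of $kl = 1 \bmod n$ and the splitting of the label k for relatively prime factors) can be sketched in a few lines. The function names and the sample values n = 15, k = 7 are illustrative assumptions, not taken from the text.

```python
from math import gcd

def euler_phi(n):
    """Number of integers in 1..n that are relatively prime to n."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def fermat_euler_inverse(k, n):
    """Solve k*l = 1 (mod n) via l = k**(phi(n)-1) mod n; needs gcd(k, n) == 1."""
    if gcd(k, n) != 1:
        raise ValueError("k must be relatively prime to n")
    return pow(k, euler_phi(n) - 1, n)

n1, n2 = 3, 5          # relatively prime factors
n = n1 * n2

l = fermat_euler_inverse(7, n)       # partner exponent for k = 7
l1 = fermat_euler_inverse(n2, n1)    # l_1 = n_2^(phi(n_1)-1) mod n_1
l2 = fermat_euler_inverse(n1, n2)    # l_2 = n_1^(phi(n_2)-1) mod n_2

# Every label k mod n arises exactly once as k_1 n_2 + k_2 n_1.
labels = sorted((k1 * n2 + k2 * n1) % n for k1 in range(n1) for k2 in range(n2))
print(l, l1, l2, labels == list(range(n)))
```

The last check reflects the fact that the factorization of the basis relies on $n_1$ and $n_2$ being relatively prime; for a common factor the labels would no longer cover all residues.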
In general, any integer can be written in terms of primes

$n = \prod_{j=1}^{f} \nu_j$ (48)

where f is the total number of primes including repetitions. The resulting
factored basis is then

$X = \prod_{j=1}^{f} X_{k_j l_j}$

$X_{k_j l_j} = \frac{1}{\sqrt{\nu_j}}\, U_j^{k_j} V_j^{l_j}, \quad k_j, l_j = 0, 1, \ldots, \nu_j - 1$

In the particular case of $\nu = 2$, the complementary pair of operators
anticommute and the basis forms the Clifford algebra. In the next section we
shall study the structure of generalized Clifford algebras.


The problem that Dirac (1926) faced was to linearize

$x_1^2 + x_2^2 + \ldots + x_m^2 = \left( \sum_{i=1}^{m} \alpha_i x_i \right)^2$

and consequently the $\alpha$'s satisfy

$\alpha_i \alpha_j = -\alpha_j \alpha_i, \quad i \neq j$
$\alpha_i^2 = I$ (52)
$i, j = 1, \ldots, m$

The set of elements defined by

$\alpha = \prod_{i=1}^{m} \alpha_i^{\lambda_i}, \quad \lambda_i = 0, 1$

which are $2^m$ in number are linearly independent and with the product defined
by equation (52) form an associative algebra which is the familiar Clifford
algebra $C_2^m$. In this case, it is well known (Boerner, 1963) that when
$m = 2\nu$ is even, there is a single irreducible representation of dimension
$2^\nu$ and $C_2^{2\nu} \cong M_{2^\nu \times 2^\nu}$. When $m = 2\nu + 1$,
$C_2^{2\nu+1} = C_2^{2\nu} + C_2^{2\nu}$, where the elements of the second
are simply the negatives of the first. The elements defined by

$\alpha_{ij} = [\alpha_i, \alpha_j]$

satisfy the algebra O(m + 1) of the orthogonal group in (m + 1) dimensions. In
fact, they furnish the spinor representation of O(m + 1). A natural question
arises whether one can solve the linearization

$x_1^n + x_2^n + \ldots + x_m^n = \left( \sum_{i=1}^{m} e_i x_i \right)^n$

The answer is yes if the e's satisfy the ordered commutation relation

$e_i e_j = \varepsilon\, e_j e_i, \quad i < j$
$e_i^n = I$ (56)
$i, j = 1, 2, \ldots, m$

$\varepsilon = e^{2\pi i/n}$
$\varepsilon^n = 1$

The basis defined by

$e = \prod_{i=1}^{m} e_i^{\tau_i}, \quad \tau_i = 0, 1, 2, \ldots, n-1$

which are $n^m$ in number are linearly independent and with the product given by
equation (56) form an associative algebra called the generalized Clifford
algebra $C_n^m$. Morinaga and Nono (1952), Yamazaki (1964) and Morris (1967)
have studied the algebra in detail and Ramakrishnan and co-workers (1969b)
have studied exhaustively the particular realizations and their connections with
physical problems. It is sufficient for our purpose to state the following
theorem.

Theorem: When $m = 2\nu$, the algebra $C_n^m$ has one irreducible representation
of dimension $n^\nu$ and $C_n^{2\nu} \cong M_{n^\nu \times n^\nu}$, the matrix
ring of dimension $n^\nu$. When $m = 2\nu + 1$,
$C_n^{2\nu+1} = C_n^{2\nu} + C_n^{2\nu} + \ldots + C_n^{2\nu}$ (n copies). The
elements of the second, third, etc., are obtained from the first by
multiplication with $\varepsilon^i$, $i = 1, 2, \ldots, n-1$.

The explicit realizations can be obtained either by a straightforward extension
of the method of Brauer and Weyl (1935), or by using the method of
Ramakrishnan (1967), or by an extension of Rasevskii's method (Ramakrishnan
and co-workers, 1969b). It should have become obvious by now that the
algebra satisfied by the canonical pair of unitary operators U and V of the last
section is simply $C_n^2 \cong M_{n \times n}$. Since in the basis $X_{kl}$ all
except the unit element are traceless, suitable linear combinations furnish a
Hermitian basis and thus give the self representation of the group SU(n)
(Ramakrishnan and co-workers, 1969a).


We now start with the Weyl algebra. We have seen that in finite dimensions it
has a single irreducible representation (which in the limit of infinite
dimensions is, in fact, the Schrödinger representation). We define the
generators as the 'formal' logarithms of the Weyl operators and solve for
them. Then we compute their commutator and show that in the limit of infinite
dimensions it reduces to that of Heisenberg.
We start with the operators $V(\xi)$ and $U(\eta)$ satisfying the Weyl algebra

$VU = \varepsilon UV$

$\varepsilon = \exp(2\pi i/n)$ (58)

$V^n = U^n = I$

We have seen that the single irreducible representation of equation (58) in
finite dimensions is given by

$U = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1})$

We define the Hermitian operators P and Q by

$V = e^{i\xi P}, \quad U = e^{i\eta Q}$

where we have chosen $\xi = \eta = \sqrt{2\pi/n}$.
Then, formally

$Q = -i \sqrt{\frac{n}{2\pi}}\, \log U$

The logarithms of U and V are well-defined since they are non-singular
(Gantmacher, 1959). Further,

$S^{-1} V S = U$ (62)

where S is the Sylvester matrix

$S = \frac{1}{\sqrt{n}} \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \varepsilon & \varepsilon^2 & \cdots & \varepsilon^{n-1} \\
\vdots & & & & \vdots \\
1 & \varepsilon^{n-1} & \varepsilon^{n-2} & \cdots & \varepsilon
\end{pmatrix}$

$S^{-1} = S^\dagger$

From the definition of the logarithm of a matrix, in view of equation (62), we
have

$\log V = S\, (\log U)\, S^{-1}$

so that

$Q = -i \sqrt{\frac{n}{2\pi}}\, \log U$

$P = -i \sqrt{\frac{n}{2\pi}}\, S\, (\log U)\, S^{-1}$

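The commutator of Q and P is then

$K_{rs} = [Q, P]_{rs} = -\frac{n}{2\pi}\, [\log U, S\, (\log U)\, S^{-1}]_{rs}$

where the matrix indices are labelled from (0, 1, 2, ..., n-1). Explicitly
evaluating equation (67), we have

$K_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s) \sum_{u=0}^{n-1} u\, \varepsilon^{u(r-s)}$

If $\varepsilon^{r-s} = x = 1$, then

$\sum_{u=0}^{n-1} u\, x^u = \frac{1}{2} n(n-1)$

If $x \neq 1$, since $x^n = 1$, there results

$\sum_{u=0}^{n-1} u\, x^u = \frac{n}{x - 1}$

Thus, we find

$K_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s) \times
\begin{cases}
\frac{1}{2} n(n-1) & \text{if } \varepsilon^{r-s} = 1 \\
\dfrac{n}{\varepsilon^{r-s} - 1} & \text{if } \varepsilon^{r-s} \neq 1
\end{cases}$

Since $\varepsilon^{r-s} = 1$ only for r = s, where the factor (r - s)
vanishes, we notice that this commutator is off-diagonal and hence trace-free,
as it should be for bounded operators.

Alternately, we can directly sum the expression in equation (68) which yields

$[Q, P]_{rs} = 0, \quad r = s$

$[Q, P]_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s)\, n \left[ -\frac{1}{2} + \frac{i}{2} \cot \frac{\pi}{n}(s - r) \right], \quad r \neq s$

We call the commutation relations (72) or (73) quantum mechanics in discrete
space (QMDS).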
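The closed form of the discrete commutator can be compared with a direct matrix computation. The sketch below makes several assumptions that should be read as such: n = 6, the branch $\log \varepsilon = 2\pi i/n$ (so that $\log U = (\log \varepsilon)\, \mathrm{diag}(0, 1, \ldots, n-1)$), and hypothetical helper names. With this branch the prefactor $-(\log \varepsilon)^2 n / 2\pi$ equals $2\pi/n$.

```python
import cmath
from math import pi, sqrt

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

n = 6
eps = cmath.exp(2j * pi / n)
N = [[r if r == s else 0 for s in range(n)] for r in range(n)]
S = [[eps ** (r * s) / sqrt(n) for s in range(n)] for r in range(n)]

# With log(eps) = 2 pi i / n: Q = sqrt(2 pi / n) N and P = sqrt(2 pi / n) S N S^dagger.
c = sqrt(2 * pi / n)
SNS = matmul(S, matmul(N, dagger(S)))
Q = [[c * N[r][s] for s in range(n)] for r in range(n)]
P = [[c * SNS[r][s] for s in range(n)] for r in range(n)]

QP, PQ = matmul(Q, P), matmul(P, Q)
K = [[QP[r][s] - PQ[r][s] for s in range(n)] for r in range(n)]

# Closed form: zero on the diagonal, (2 pi / n)(r - s) / (eps^(r-s) - 1) elsewhere.
closed = [[0 if r == s else (2 * pi / n) * (r - s) / (eps ** (r - s) - 1)
           for s in range(n)] for r in range(n)]

max_diff = max(abs(K[r][s] - closed[r][s]) for r in range(n) for s in range(n))
trace_K = sum(K[r][r] for r in range(n))
hermitian_P = all(abs(P[r][s] - P[s][r].conjugate()) < 1e-9
                  for r in range(n) for s in range(n))
print(max_diff < 1e-9, abs(trace_K) < 1e-9, hermitian_P)
```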

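Let us now evaluate the following commutator, which we shall use later when we
discuss the application of QMDS to the algebra of angular momentum
operators. Let

$$\Omega = [Q, V] \qquad (74)$$

From equations (65) and (66) it is clear that

$$\Omega = -i\sqrt{\frac{n}{2\pi}}\,\log\epsilon\,[N,\; SUS^{-1}] \qquad (75)$$

where the matrix

$$N = \mathrm{diag}\{0, 1, 2, \ldots, (n-1)\} \qquad (76)$$

Explicit evaluation of equation (75) yields

$$\Omega = i\sqrt{\frac{n}{2\pi}}\,(\log\epsilon)\,K \qquad (77)$$

where the matrix

$$K = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-(n-1) & 0 & 0 & \cdots & 0
\end{pmatrix} = V - L, \qquad
L = \begin{pmatrix}
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 \\
n & 0 & \cdots & 0
\end{pmatrix}$$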
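The step from the commutator $[N, SUS^{-1}] = [N, V]$ to the matrix $K = V - L$ amounts to the identity $[N, V] = -(V - L)$. A short integer check follows, as a sketch only: $V$ is again assumed to have ones just above the diagonal and in the lower-left corner, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    """Entrywise difference a - b."""
    n = len(a)
    return [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]

n = 5
# N = diag{0, 1, ..., n-1}
N = [[r if r == s else 0 for s in range(n)] for r in range(n)]
# Assumed convention for the cyclic permutation matrix V.
V = [[1 if s == (r + 1) % n else 0 for s in range(n)] for r in range(n)]
# L is zero except for the entry n in the lower-left corner.
L = [[n if (r, s) == (n - 1, 0) else 0 for s in range(n)] for r in range(n)]

K = sub(V, L)
comm = sub(matmul(N, V), matmul(V, N))          # [N, V]
minus_K = [[-x for x in row] for row in K]
print(comm == minus_K, K[n - 1][0])
```

With $[N, V] = -K$, the prefactor $-i\sqrt{n/2\pi}\,\log\epsilon$ turns into $+i\sqrt{n/2\pi}\,(\log\epsilon)$ multiplying $K$.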




We shall now show that the commutator of the bounded operators given by
equation (72) does reduce to the usual Heisenberg form in the limit of a
continuous spectrum, valid as $n \to \infty$. Beginning with equation (68), we relabel
the rows and columns from $-(n-1)/2$ to $(n-1)/2$ instead of from $0$ to $n-1$
and replace the sum by an integral. In other words, we let the matrix indices
take continuous values and take the limit $n \to \infty$. This is the method of
Heisenberg (1931) and Dirac (1930) for passing from the discrete to the continuous
case. Then we have

$$\begin{aligned}
[Q, P]_{rs} &= -\frac{(\log\epsilon)^2}{2\pi}\,(r-s)\int_{-\infty}^{\infty} u\,\exp\{2\pi i u(r-s)/n\}\,du \\
&= -i(r-s)\,\frac{d}{d(r-s)}\int_{-\infty}^{\infty} \exp\{2\pi i u(r-s)/n\}\,d\!\left(\frac{u}{n}\right) \\
&= -i(r-s)\,\delta'(r-s) \\
&= i\,\delta(r-s)
\end{aligned}$$

where $\delta$, $\delta'$ are the Dirac delta function and its derivative, respectively. We
have used the fact that in the limit considered above $\log\epsilon = 2\pi i/n$, i.e. we have
retained only its principal part.
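The final equality rests on the distributional identity $x\,\delta'(x) = -\delta(x)$. A brief check against a smooth test function $f$ (introduced here only for illustration) may be sketched as:

```latex
\int_{-\infty}^{\infty} x\,\delta'(x)\,f(x)\,dx
  = -\frac{d}{dx}\bigl[x f(x)\bigr]_{x=0}
  = -f(0)
  = -\int_{-\infty}^{\infty} \delta(x)\,f(x)\,dx ,
\qquad\text{so that}\qquad
-i(r-s)\,\delta'(r-s) = i\,\delta(r-s).
```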

This is exactly what Weyl does, by choosing the parameter $\eta$ infinitesimal.
The same limiting method can be demonstrated by starting from equation (73). 
Thus, QMDS reduces to the usual theory in the limit mentioned. Since the 
representation of the Weyl group is unique, it is clear that QMDS is unique too. 
Since the commutator (QMDS) is off-diagonal, the diagonal measurements 
'commute' and hence there is no 'uncertainty'. However, the commutator is not 
zero as in the classical case. 


In this section, we apply the concept of QMDS studied in detail in the previous 
sections to the study of the angular momentum operators which provide an 
excellent example of bounded operators with a discrete spectrum. We shall 
reformulate the angular momentum algebra by quantizing the azimuthal 
direction instead of the zenithal angle (Levy-Leblond, 1973). We also briefly 
remark on the related problem of defining a phase canonically conjugate to the 
number operator which has a lower bound. 

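Denoting the generators of rotation by $J_x$, $J_y$ and $J_z$, we know that they satisfy
the commutation relation

$$[J_i, J_j] = i\epsilon_{ijk}J_k, \qquad i, j, k \text{ cyclic} \qquad (80)$$

or in the Cartan canonical form (choosing $J_z$ diagonal)

$$[J_z, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = 2J_z \qquad (81)$$

$$J_\pm = J_x \pm iJ_y \qquad (82)$$

What is usually done is to choose a basis diagonal in $J_z$ and $J^2 = J_x^2 + J_y^2 + J_z^2$, i.e.

$$J_z\psi^j_m = m\,\psi^j_m, \qquad J^2\psi^j_m = j(j+1)\,\psi^j_m, \qquad -j \le m \le j \qquad (83)$$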

By polar decomposing $J_+$ we have

$$J_+ = J_T Y \qquad (84)$$

with $J_T$ Hermitian and $Y$ unitary. Taking the adjoint yields

$$J_- = (J_+)^\dagger = Y^{-1}J_T \qquad (85)$$

It follows that

$$J_+J_- = J_T^2 = J^2 - J_z^2 + J_z \qquad (86)$$

We define the transverse component $J_\perp$ as

$$J_+ = YJ_\perp \qquad (87)$$

and hence

$$J_\perp^2 = J_+^\dagger J_+ = J^2 - J_z^2 - J_z \qquad (88)$$

In the $(J^2, J_z)$ diagonal basis given by equation (83) we find from equations (86)
and (87) that (choosing a phase convention)

$$\langle jm|J_T|jn\rangle = \delta_{mn}[(j+m)(j-m+1)]^{1/2} \qquad (89)$$

$$\langle jm|J_\perp|jn\rangle = \delta_{mn}[(j-m)(j+m+1)]^{1/2} \qquad (90)$$


It follows from equations (84) and (89) and equations (87) and (90) that in this
basis the operator $Y$ is just the cyclic permutation matrix $A$ defined in the last
section. In fact,

$$N = -J_z + jI \qquad (92)$$

where $I$ is the unit matrix.
From equation (77) it can be seen that

$$[J_z, A] - A = -L \qquad (93)$$

which has been derived by Levy-Leblond. It is also clear from equation (72) that

$$[J_z, \phi]_{rs} =
\begin{cases}
i(\log\epsilon)\,\dfrac{r-s}{\epsilon^{r-s}-1} & \text{for } \epsilon^{r-s} \neq 1 \\
0 & \text{for } \epsilon^{r-s} = 1
\end{cases} \qquad (94)$$
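The statements about $Y$ can be verified for a small spin. The sketch below takes $j = 3/2$ and orders the basis so that index $r$ corresponds to $m = j - r$, in line with $N = -J_z + jI$. That ordering, the convention for the cyclic matrix, and all helper names are assumptions made for the illustration.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    """Entrywise difference a - b."""
    n = len(a)
    return [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def diag(values):
    """Diagonal matrix with the given entries."""
    n = len(values)
    return [[values[r] if r == s else 0.0 for s in range(n)] for r in range(n)]

j = 1.5
n = int(2 * j + 1)
m = [j - r for r in range(n)]                 # index r <-> m = j - r

Jz = diag(m)
JT = diag([math.sqrt((j + x) * (j - x + 1)) for x in m])
Jperp = diag([math.sqrt((j - x) * (j + x + 1)) for x in m])
# Y taken as the cyclic permutation matrix (ones at (r, r+1 mod n)).
Y = [[1.0 if s == (r + 1) % n else 0.0 for s in range(n)] for r in range(n)]
# Standard raising operator: <m+1| J_+ |m> = sqrt((j - m)(j + m + 1)).
Jplus = [[math.sqrt((j - m[s]) * (j + m[s] + 1)) if s == r + 1 else 0.0
          for s in range(n)] for r in range(n)]
L = [[float(n) if (r, s) == (n - 1, 0) else 0.0 for s in range(n)]
     for r in range(n)]

polar_ok = close(matmul(JT, Y), Jplus)        # J_+ = J_T Y
transverse_ok = close(matmul(Y, Jperp), Jplus)  # J_+ = Y J_perp
lhs = sub(sub(matmul(Jz, Y), matmul(Y, Jz)), Y)  # [J_z, Y] - Y
levy_leblond_ok = close(lhs, [[-x for x in row] for row in L])
print(polar_ok, transverse_ok, levy_leblond_ok)
```

The wrap-around element of $Y$ never contributes to $J_+$, because $J_T$ vanishes on $m = -j$ and $J_\perp$ vanishes on $m = j$; this is why a unitary cyclic matrix can stand in for the non-unitary shift.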

Since $Y\,(= A)$ acts as a cyclic permutation matrix in the basis in which $J_z$ is
diagonal, it follows (Schwinger) that the $(J^2, Y)$ diagonal basis given by

$$J^2|j,\mu\rangle = j(j+1)|j,\mu\rangle, \qquad Y|j,\mu\rangle = \mu|j,\mu\rangle, \qquad \mu = \epsilon^{\zeta}, \quad \zeta = -j, \ldots, +j$$

is connected by the Sylvester transform to the $(J^2, J_z)$ basis of equation (83), i.e.

$$|j,\zeta\rangle = \sum_m S_{\zeta m}\,|j,m\rangle = (2j+1)^{-1/2}\sum_m \epsilon^{m\zeta}\,|j,m\rangle$$



The matrix elements of $J_z$ in the new basis are given by

$$\langle j,\zeta|J_z|j,\zeta'\rangle = \frac{1}{2j+1}\sum_{m=-j}^{j} m\,\epsilon^{m(\zeta'-\zeta)}$$


Of course, one knows that in a finite-dimensional space we can go from one
basis $(jm)$ to another basis $(j\zeta)$ by a unitary transformation $(S)$. What is
important, however, is the fact that the operator $Y$ (not $J_\pm$) acts as a cyclic
permutation matrix in the $J_z$ diagonal basis and that the commutation relation of $Y$
with $J_z$ is furnished by equation (93), which carries the essence of our QMDS.
From equations (92) and (95) it follows that

$$[N, \phi]_{rs} =
\begin{cases}
0 & \text{for } \epsilon^{r-s} = 1 \\
-i(\log\epsilon)\,\dfrac{r-s}{\epsilon^{r-s}-1} & \text{for } \epsilon^{r-s} \neq 1
\end{cases} \qquad (98)$$

Thus the number operator $N$ (with spectrum bounded below) and the phase
operator are conjugate in the sense of equation (98). It is perhaps a great luxury
to demand that they must be canonically conjugate.


We have discussed the quantum mechanics of bounded operators with a 
discrete spectrum acting on a finite dimensional Hilbert space. We have shown 
that in the limit of continuous spectrum with the dimension going to infinity one 
gets the usual theory. As an illustration we have studied the algebra of angular 
momentum operators reformulated by quantizing the azimuthal angle. We 
have briefly remarked about the phase operator canonically conjugate to the 
number operator. We believe that QMDS can be applied to a system with
periodicity, such as the cyclic lattice. Also, we may avoid many difficulties
(divergences etc.) if we work with a finite number of states and eventually take
suitable limits. It remains to be seen how QMDS fares when applied to real systems.


I thank Professors B. Gruber and W. Hink for their gracious hospitality at the
University of Würzburg. The article was written during my stay at the International
Centre for Theoretical Physics, Trieste. I am grateful to Professor
Abdus Salam and the I.A.E.A. for their hospitality. A brief discussion with
Professor C. N. Yang is gratefully acknowledged.



Boerner, H. (1963) Representations of Groups, North-Holland Publishing Co., Amsterdam, Chap.
Brauer, R. and Weyl, H. (1935) 'Spinors in n dimensions', Am. J. Math., 57, 425-449.
Cartier, P. (1966) 'Quantum mechanical commutation relations and theta functions', Proc. Symp. Pure Math., 9, 361-383.
Dirac, P. A. M. (1926) Proc. Roy. Soc., 109 A, 642.
Dirac, P. A. M. (1930) The Principles of Quantum Mechanics, Oxford University Press, London.
Dixmier, J. (1958) 'Sur la relation i(PQ - QP) = 1', Compositio Math., 13, 263-269.
Gantmacher, F. R. (1959) Matrix Theory, Vol. I, Chelsea, New York, p. 239.
Heisenberg, W. (1925) Zeit. Phys., 33, 879.
Heisenberg, W. (1931) The Physical Principles of the Quantum Theory, Dover, New York.
Levy-Leblond, J. M. (1973) 'Azimuthal quantization of angular momentum', Rev. Mex. Fis., 22,
Mackey, G. W. (1949) 'On a theorem of Stone and von Neumann', Duke Math. J., 16, 313-326.
Morinaga, K. and Nono, T. (1952) J. Sci. Hiroshima Univ., A6, 13.
Morris, A. O. (1967) 'On a generalized Clifford algebra', Quart. J. Math. Oxford (2), 18, 7-12.
Nelson, E. (1959) 'Analytic vectors', Ann. Math., 70, 572-615.
von Neumann, J. (1931) 'Die Eindeutigkeit der Schrödingerschen Operatoren', Math. Ann., 104,
Ramakrishnan, A. (1967) 'Dirac Hamiltonian as a member of a hierarchy', J. Math. Anal. Appl.,
Ramakrishnan, A., Chandrasekaran, P. S., Ranganathan, N. R., Santhanam, T. S. and Vasudevan, R. (1969a) 'Generalized Clifford algebra and the unitary group', J. Math. Anal. Appl., 27, 164.
Ramakrishnan, A., Santhanam, T. S. and Chandrasekaran, P. S. (1969b) 'Representation theory of generalized Clifford algebras', J. Math. Phys. Sci. (Madras), 3, 307.
Rellich, F. (1946) 'Der Eindeutigkeitssatz für die Lösungen der quantum-mechanischen Vertauschungsrelationen', Nachr. Akad. Wiss. Göttingen, Math. Physik. Kl., 107-115.
Santhanam, T. S. and Tekumalla, A. R. (1976) 'Quantum mechanics in finite dimensions', Foundations of Physics, 6, 5, 583-587.
Schwinger, J. (1960) 'Unitary operator bases', Proc. Nat. Acad. Sci. (USA), 46, 570-579.
Stone, M. (1930) 'Linear transformations in Hilbert space III, operational methods in group theory', Proc. Nat. Acad. Sci. (USA), 16, 172-175.
Weyl, H. (1931) Theory of Groups and Quantum Mechanics, Dover, New York, pp. 272-280.
Yamazaki, K. (1964) 'On projective representations and ring extensions of finite groups', J. Fac. Sci. Univ. Tokyo, Sect. I, 10, 147-195.


Formal Quantum Theory 

Four Approaches to Axiomatic Quantum Mechanics 


University of Denver, Denver, U.S.A. 


This is a survey article on contemporary approaches to axiomatic quantum 
mechanics. There are, at present, four main frameworks within which axioma- 
tic quantum mechanics is being studied. These are the classical approach, the 
algebraic approach, the quantum logic approach and the convexity approach. 
Each of these approaches has its advocates and critics, its strengths and 
weaknesses, its history and literature. To do justice to any one of these 
approaches, an entire volume could easily be dedicated to each. Therefore, by
necessity, this survey must be fairly superficial. I shall include an introduction
to the framework of each approach, some of the interrelations between them 
and a few of the important results they encompass. I hope to give the reader a 
unifying viewpoint, expose him to results that are scattered throughout the 
literature and not previously compiled in one place, and finally to announce 
some little known and new results. 

The importance of axiomatic quantum theories to mathematics and physics 
has perhaps not been sufficiently recognized. This field is not only important in 
its own right, but has had tremendous influence and spin-off to other areas. For 
example, in the physical sciences there are important applications of results, 
methods and concepts of these theories to statistical mechanics, ther- 
modynamics, turbulence, solid-state physics and laser physics, among others. 
In mathematics, many of the most active areas of research owe their original 
conception and/or later development to axiomatic quantum mechanics. These 
include: Hilbert spaces, self-adjoint and symmetric operators, spectral theory, 
general operator theory, von Neumann algebras, Lie groups and algebras, 
group representations, Schwartz distributions, C*-algebras, Jordan algebras, 
modular lattices, orthomodular lattices, continuous geometries and functions 
of several complex variables. Axiomatic quantum mechanics is a prime example
of the fertile interplay between mathematics and physics. Even if the
present methods, concepts and results prove to be obsolete and are eventually
superseded, the applications and mathematics that they have inspired will 
justify their existence. 



It is generally recognized that the two most basic concepts in quantum 
mechanics are those of a state and an observable. These two concepts serve as 
the basic building blocks of most axiomatic theories and, in particular, the 
approaches considered here. Each of the four approaches of this article will 
have as its primitive axiomatic elements one of these entities and this is one of 
our unifying themes. More specifically, in the classical approach, the observables
are assumed to be self-adjoint operators in a Hilbert space; in the algebraic
approach, the observables are taken to be elements of a C*-algebra; in the
quantum logic approach, the axiomatic elements are certain types of observables
called propositions; and in the convexity approach, the states are taken to
be elements of a convex structure.


The classical approach to axiomatic quantum mechanics is not only the 
prototype for other approaches, it is the most widely used and probably the 
most popular among physicists. It was originated by Dirac (1930) and von 
Neumann (1932). There are three equivalent formulations for this approach. 

(A) Formulation 1 

The observables $\mathcal{O}$ of a physical system are described by self-adjoint linear
operators acting on a complex Hilbert space $H$. Thus in this formulation, the
observables are the basic axiomatic elements. We shall usually identify an
observable with its corresponding operator.

A state of a physical system is a complete description of the preparation or
condition of the system. In quantum mechanics the state is determined by the
expectations or average values of the observables when the system is prepared
according to that state. Thus, the states can be described by the set of
expectation functionals $\mathcal{S}$ on $\mathcal{O}$. For $s \in \mathcal{S}$ and $A \in \mathcal{O}$, we define $s(A)$ to be the
expectation of $A$ in the state corresponding to $s$. Of course, for an unbounded
observable $A$, the expectation $s(A)$ may not exist. It is usually convenient,
therefore, to consider the set of bounded observables $\mathcal{O}_b$. These are described
by bounded self-adjoint operators. The set $\mathcal{O}_b$ is still large enough to determine
a state.

What are the properties of the functional $s(A)$ for $s \in \mathcal{S}$, $A \in \mathcal{O}_b$? First, the
identity operator $I$ corresponds to the observable that always has the value one,
so $s(I) = 1$. Also, if $A$ is a self-adjoint operator with a non-negative spectrum
(this corresponds to an observable with non-negative values) then we should
have $s(A) \ge 0$. Furthermore, one can argue that two observables whose
self-adjoint operators commute are simultaneously measurable; that is, a
measurement of one does not interfere with a measurement of the other. In this
case, the expectation of their sum should be the sum of their expectations. In
slightly more general form, $s(\alpha A + \beta B) = \alpha s(A) + \beta s(B)$ for all $\alpha, \beta \in R$ ($R$ is
the set of real numbers) whenever $A$ and $B$ commute. Finally, a continuity
condition is imposed. This is justified by the fact that if two observables are
'close', then their expectations should also be 'close'. But how do we define
'close' mathematically? Here it is defined in terms of strong convergence. That
is, a sequence of bounded operators $A_i$ converges strongly to a bounded
operator $A$ if $A_i\phi \to A\phi$ for every $\phi \in H$. The continuity condition is given as
follows: if a sequence of bounded observables $A_i$ converges strongly to $A$, then
$s(A_i) \to s(A)$.

Using a truly amazing theorem due to Gleason (1957), the above four
conditions give the following characterization of states. For every $s \in \mathcal{S}$ there is
a positive trace class operator $T_s$ (the density operator) of trace 1 such that
$s(A) = \mathrm{Tr}(T_sA)$ for every $A \in \mathcal{O}_b$. Thus it follows that a state is not only a linear
functional on $\mathcal{O}_b$ but it is given by a density operator.

There are many interesting consequences of these simple, far-reaching
axioms. We now mention two of them and others will be seen later. Let $\phi$ be a
unit vector and let $P_\phi$ be the one-dimensional projection onto the subspace
determined by $\phi$. Then $P_\phi$ is a self-adjoint operator and also a density operator.
Thus $P_\phi$ can be interpreted as both an observable and a state. This dual nature
of $P_\phi$ makes it possible to define transition probabilities in a succinct manner.
The significance of $P_\phi$ as a state is that $P_\phi$ is a pure state since it cannot be
written as a convex combination of other states. As an observable, $P_\phi$ can be
interpreted as corresponding to the statement, 'the system is in the pure state
$P_\phi$'. If $P_\phi$ and $P_\psi$ are two pure states, then $\mathrm{Tr}(P_\phi P_\psi)$ is the expectation of the
observable $P_\psi$ in the state $P_\phi$ and is interpreted as the probability that the
system will be found in the state $P_\psi$ when we know it is in the state $P_\phi$. This is the
transition probability between the two states and is given by

$$\mathrm{Tr}(P_\phi P_\psi) = \langle\phi, P_\psi\phi\rangle = |\langle\phi, \psi\rangle|^2$$

Notice that for a pure state $P_\phi$, the expectation of an observable $A$ is given by
$\mathrm{Tr}(P_\phi A) = \langle\phi, A\phi\rangle$. Another interesting consequence of the axioms is the fact
that the product of the variances of two observables satisfies

$$s([A - s(A)]^2)\,s([B - s(B)]^2) \ge \tfrac{1}{4}\,|s([A, B])|^2$$

This inequality provides a lower bound for the simultaneous measurability of $A$
and $B$ and is a mathematical formulation of the Heisenberg uncertainty
principle. It also shows that two commuting observables are simultaneously
measurable which substantiates our earlier statement to that effect.
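Both consequences can be illustrated on a two-dimensional Hilbert space. The sketch below is an illustration under stated assumptions, not part of the original argument: the observables are the Pauli matrices, the density operator `T` is a hypothetical mixed state, and the helper names are invented for the example.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def expect(T, A):
    """Expectation s(A) = Tr(T A) in the state with density operator T."""
    p = matmul(T, A)
    return p[0][0] + p[1][1]

def variance(T, A):
    """s([A - s(A)]^2)."""
    d = lin(1, A, -expect(T, A), I2)
    return expect(T, matmul(d, d)).real

def projector(v):
    """One-dimensional projection onto the unit vector v."""
    return [[v[i] * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]

# Hypothetical mixed state T = (I + 0.3 SX + 0.6 SZ) / 2.
T = lin(0.5, I2, 0.5, lin(0.3, SX, 0.6, SZ))
commutator = lin(1, matmul(SX, SY), -1, matmul(SY, SX))
lhs = variance(T, SX) * variance(T, SY)
rhs = 0.25 * abs(expect(T, commutator)) ** 2

# Transition probability between two pure states.
phi = [1, 0]
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]
transition = expect(projector(phi), projector(psi)).real
overlap_sq = abs(sum(complex(a).conjugate() * b for a, b in zip(phi, psi))) ** 2
print(lhs, rhs, transition, overlap_sq)
```

For this choice the variance product exceeds the bound, and the trace formula for the transition probability agrees with $|\langle\phi, \psi\rangle|^2$.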

The dynamics of the system is also easily formulated within this framework.
If an observable is given by the operator $A$ at time $t = 0$, then this observable is
given by an operator $W_tA$ at time $t$. This is the Heisenberg picture of the
dynamics. There is an equivalent formulation called the Schrödinger picture in
which the observables are kept constant and the states are assumed to evolve. If
the system is in a pure state given by a unit vector $\phi$ at time $t = 0$, then at time $t$
the system is in a pure state given by a unit vector $W_t\phi$. It follows by a theorem
of Mazur and Ulam (1932) (there is also a related theorem due to Wigner
(1931)) that $W_t$ is given by a unitary operator $U_t$. Furthermore, since

$$\langle\phi, W_tA\phi\rangle = \langle U_t\phi, AU_t\phi\rangle = \langle\phi, U_t^{-1}AU_t\phi\rangle$$

we see that $W_tA = U_t^{-1}AU_t$.

If the state at time $t_1$ is given by $U_{t_1}\phi$, then the state at time $t_2 + t_1$ is given by
$U_{t_2+t_1}\phi = U_{t_2}U_{t_1}\phi$ and so we must have $U_{t_2+t_1} = U_{t_2}U_{t_1}$ for every $t_1, t_2 \in
(-\infty, \infty)$. Furthermore, letting $t_2 = 0$ and $t_2 = -t_1$ in the above identity gives
$U_0 = I$ and $U_{-t} = U_t^{-1}$, so $U_t$ forms a one-parameter group of unitary transformations.
It is also usually assumed that $t \mapsto U_t$ is strongly continuous. It follows
from Stone's theorem (Stone, 1930) that there exists a unique self-adjoint
operator $H_0$ such that $U_t\phi = e^{-iH_0t}\phi$ for all $t \in (-\infty, \infty)$. The operator $H_0$ is
identified with the Hamiltonian of the system. The differential form of the
evolution laws becomes:

$$\frac{d}{dt}\phi_t = \frac{d}{dt}U_t\phi = -iH_0U_t\phi = -iH_0\phi_t$$

$$\frac{d}{dt}A_t = \frac{d}{dt}U_t^{-1}AU_t = i[H_0, U_t^{-1}AU_t] = i[H_0, A_t]$$

The first of these equations is called Schrödinger's equation.
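The group law and the Heisenberg equation of motion can be checked on a two-level example. This is a minimal sketch under assumptions of our own: the Hamiltonian is taken to be the Pauli matrix $\sigma_x$, for which $e^{-iH_0t} = \cos t\,I - i\sin t\,\sigma_x$, and the time derivative is approximated by a central difference. All names are hypothetical.

```python
import math

SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def sub(a, b):
    """Entrywise difference a - b."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def max_abs(a):
    """Largest absolute value of any entry."""
    return max(abs(x) for row in a for x in row)

def U(t):
    """exp(-i H0 t) for the assumed Hamiltonian H0 = SX (uses SX^2 = I)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def heisenberg(A, t):
    """A_t = U_t^{-1} A U_t, with U_t^{-1} equal to the adjoint of U_t."""
    return matmul(dagger(U(t)), matmul(A, U(t)))

t1, t2 = 0.4, 1.1
group_err = max_abs(sub(U(t1 + t2), matmul(U(t2), U(t1))))
unitary_err = max_abs(sub(matmul(dagger(U(t1)), U(t1)), [[1, 0], [0, 1]]))

# Central difference for dA_t/dt compared with i[H0, A_t].
h, t = 1e-5, 0.7
A_t = heisenberg(SZ, t)
later, earlier = heisenberg(SZ, t + h), heisenberg(SZ, t - h)
deriv = [[(later[i][j] - earlier[i][j]) / (2 * h) for j in range(2)]
         for i in range(2)]
rhs = [[1j * x for x in row] for row in sub(matmul(SX, A_t), matmul(A_t, SX))]
heisenberg_err = max_abs(sub(deriv, rhs))
print(group_err, unitary_err, heisenberg_err)
```

All three errors should be at the level of floating-point or finite-difference noise.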

(B) Formulation 2 

Formulation 1 is the prototype of the algebraic approach to axiomatic quantum
mechanics. We next briefly consider an equivalent formulation which is the
prototype of the quantum logic approach. In this formulation the primitive
axiomatic elements are the 'propositions' of a physical system. A proposition of
a system represents a special type of observable that has at most two values,
0 and 1 or true and false. For example, a counter which is either activated or
unactivated is a proposition. A filter that passes only certain types of particles is
a proposition since a particle either passes or does not pass. The basic postulate
is that the propositions of a physical system are described by the set of closed
subspaces $\mathcal{P}$ of a complex Hilbert space $H$. This is equivalent to describing the
propositions by orthogonal projections on $H$.

The set of states $\mathcal{S}'$ of the system now determines the probabilities that
propositions are true. Thus, if $s \in \mathcal{S}'$, $P \in \mathcal{P}$ then $s(P) \in [0, 1]$. Since the identity
projection $I$ corresponds to the proposition that is always true, we must have
$s(I) = 1$ for every $s \in \mathcal{S}'$. If $P_i$ is a sequence of mutually orthogonal projections
(i.e. $P_iP_j = 0$, $i \neq j$) then the probability that $\sum P_i$ is true should be the sum of the
probabilities that each $P_i$ is true. We therefore postulate that $s(\sum P_i) = \sum s(P_i)$
for every $s \in \mathcal{S}'$. In other words a state is described by a 'probability measure'
on $\mathcal{P}$. Again by Gleason's theorem, if $s \in \mathcal{S}'$ there exists a unique density
operator $T_s$ such that $s(P) = \mathrm{Tr}(T_sP)$ for every $P \in \mathcal{P}$.


The general observables come into the theory in the following way. If $X$ is an
observable and $E$ is a Borel subset of the real line then the pair $(X, E)$
corresponds to the proposition: '$X$ has a value in the set $E$'. Thus $X$ can be
thought of as a map from the Borel sets $B(R)$ into $\mathcal{P}$. It is easy to justify that
$X: B(R) \to \mathcal{P}$ should satisfy

(1). $X(R) = I$;

(2). If $E \cap F = \emptyset$, then $X(E)X(F) = 0$;

(3). If $E_i$ are mutually disjoint, then $X(\bigcup E_i) = \sum X(E_i)$.

Thus $X$ can be thought of as a projection-valued measure. By the spectral
theorem there exists a unique self-adjoint operator $A$ such that $A = \int \lambda\,X(d\lambda)$.
Conversely, if $A$ is a self-adjoint operator, then there is a projection-valued
measure $E \mapsto P^A(E)$ such that $A = \int \lambda\,P^A(d\lambda)$. Thus there exists a one-to-one
correspondence between observables and self-adjoint operators.

If $s \in \mathcal{S}'$ and $A$ is an observable, then $s[P^A(E)]$ is the probability that $A$ has a
value in the set $E$ when the system is in the state $s$. It is then clear that the
expectation of $A$ in the state $s$ is

$$s(A) = \int \lambda\,s[P^A(d\lambda)] = \int \lambda\,\mathrm{Tr}[T_sP^A(d\lambda)] = \mathrm{Tr}\Bigl[T_s\int \lambda\,P^A(d\lambda)\Bigr] = \mathrm{Tr}(T_sA)$$

We thus see that corresponding to a state $s \in \mathcal{S}'$ as defined in this formulation
there is a state $s \in \mathcal{S}$ as defined in Formulation 1. Conversely, if $s \in \mathcal{S}$ as defined
in Formulation 1 then in particular $s$ can act on projections. If we restrict $s$ to $\mathcal{P}$
then $s$ is a state in $\mathcal{S}'$ as defined in Formulation 2. For this reason, Formulations
1 and 2 are equivalent.
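In finite dimensions the projection-valued measure of an observable is simply a sum of its spectral projections, which makes the chain of equalities for $s(A)$ easy to test. The following sketch uses the Pauli matrix $\sigma_x$ as the observable and a hypothetical density matrix; representing a Borel set by a predicate on eigenvalues is a choice made only for this illustration.

```python
I2 = [[1.0, 0.0], [0.0, 1.0]]
SX = [[0.0, 1.0], [1.0, 0.0]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

# Spectral projections of A = SX for the eigenvalues +1 and -1.
SPECTRAL = {1: lin(0.5, I2, 0.5, SX), -1: lin(0.5, I2, -0.5, SX)}

def pvm(in_set):
    """P^A(E): sum of the spectral projections whose eigenvalue lies in E.

    The set E is represented by the predicate in_set(eigenvalue)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, proj in SPECTRAL.items():
        if in_set(lam):
            out = lin(1, out, 1, proj)
    return out

# Hypothetical density operator (positive, trace one).
T = [[0.8, 0.15], [0.15, 0.2]]

prob_plus = trace(matmul(T, pvm(lambda lam: lam > 0)))
prob_minus = trace(matmul(T, pvm(lambda lam: lam < 0)))
mean_from_measure = sum(lam * trace(matmul(T, proj))
                        for lam, proj in SPECTRAL.items())
mean_from_trace = trace(matmul(T, SX))
whole_line = pvm(lambda lam: True)
print(prob_plus, prob_minus, mean_from_measure, mean_from_trace)
```

The two probabilities add up to one, `whole_line` is the identity as required by condition (1), and the expectation computed from the measure agrees with $\mathrm{Tr}(T_sA)$.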

(C) Formulation 3 

This formulation of the classical approach to axiomatic quantum mechanics is
the prototype of the convexity approach. In this formulation the states form the
basic axiomatic elements. The basic postulate is that the states of a physical
system can be described by density operators $\mathcal{S}$ on a complex Hilbert space $H$.
Convexity comes into play since $\mathcal{S}$ is a (strong) convex set. That is, if $T_i \in \mathcal{S}$ and
$\sum \lambda_i = 1$, $\lambda_i \ge 0$, then $\sum \lambda_iT_i \in \mathcal{S}$. We call $\sum \lambda_iT_i$ a mixture of the states $T_i$. The
extreme points of $\mathcal{S}$ are the states that cannot be written as mixtures of other
states. These states are also called pure states and are given by one-dimensional
projections.
In Formulation 1, we mentioned that a state is determined by the expectation
values it gives to observables. In that formulation the observables were the
basic axiomatic elements and the states were derived from the observables. We
now turn the situation around. Now the states are the axiomatic elements and
we shall derive the observables from the states. Hence if $A$ is a bounded
observable (we are not now assuming that $A$ corresponds to a self-adjoint
operator; this will follow automatically) and $T \in \mathcal{S}$ a state, we define $A(T)$ to be
the expectation of $A$ in the state $T$. We now seek the properties of the function
$T \mapsto A(T)$. First, it is natural to assume that this function preserves convex
combinations. That is,

$$A\Bigl(\sum \lambda_iT_i\Bigr) = \sum \lambda_iA(T_i)$$

whenever $\lambda_i \ge 0$, $\sum \lambda_i = 1$. Second, since observables must have real expectations,
we assume $A(T) \in R$ for every $T \in \mathcal{S}$. Finally, since states that are 'close'
should give expectations that are 'close', a continuity condition is imposed. This
continuity condition is usually given in terms of the trace norm. This norm is
defined as follows. If $T_1, T_2 \in \mathcal{S}$ then there is a unique positive trace class
operator $T_3$ such that $T_3^2 = (T_1 - T_2)^2$. This operator is denoted $|T_1 - T_2|$. The
trace norm of $T_1 - T_2$ is $\|T_1 - T_2\|_1 = \mathrm{Tr}\,|T_1 - T_2|$. The continuity condition
becomes: if $T_i, T \in \mathcal{S}$ and $\|T_i - T\|_1 \to 0$ as $i \to \infty$, then $A(T_i) \to A(T)$ for every
bounded observable $A$. If the above three conditions hold, it can be shown
(Schatten, 1950) that for any bounded observable $A$ there exists a unique
bounded self-adjoint operator $A$ on $H$ such that $A(T) = \mathrm{Tr}(TA)$. We thus
see that this formulation is equivalent to Formulations 1 and 2.
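The three ingredients of this formulation (mixtures, affine expectation functions and the trace norm) can be illustrated with real symmetric $2 \times 2$ matrices. The states, the observable and the weight below are hypothetical examples chosen for the sketch, and the eigenvalue formula used for the trace norm is specific to the $2 \times 2$ symmetric case.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

def trace_norm(m):
    """Tr|M| for a real symmetric 2x2 matrix: sum of absolute eigenvalues."""
    t = m[0][0] + m[1][1]
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(t * t - 4 * d, 0.0))
    return abs((t + root) / 2) + abs((t - root) / 2)

T1 = [[1.0, 0.0], [0.0, 0.0]]      # a pure state (one-dimensional projection)
T2 = [[0.5, 0.0], [0.0, 0.5]]      # the maximally mixed state
A = [[0.0, 1.0], [1.0, 2.0]]       # hypothetical bounded observable

def expectation(T):
    """A(T) = Tr(T A)."""
    return trace(matmul(T, A))

lam = 0.25
mixture = lin(lam, T1, 1 - lam, T2)
affine_gap = abs(expectation(mixture)
                 - (lam * expectation(T1) + (1 - lam) * expectation(T2)))
distance = trace_norm(lin(1, T1, -1, T2))
print(trace(mixture), affine_gap, distance)
```

The mixture again has trace one, the expectation function preserves the convex combination, and the trace-norm distance between the pure state and the maximally mixed state comes out as 1.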

(D) Strengths and Weaknesses 

Now that we have seen a brief formulation of the classical approach, a natural 
question is, why look at other approaches? Is there something wrong with this 
approach that makes it necessary to abandon or modify it? A lucid discussion of 
this question can be found in (Emch, 1972). Here we shall limit ourselves to a 
few comments. 

Let us begin with the strengths of the classical approach. First, and most 
important, this approach has been highly successful, especially for systems with 
a finite number of degrees of freedom. Second, it has the advantage of 
concreteness. The observables can be identified with self-adjoint operators, the 
states with density operators, the propositions with projection operators and 
the dynamics with unitary operators on a Hilbert space. Now all that is needed 
is to specify what Hilbert space is to be used and to give a prescription for the 
self-adjoint operator that corresponds to each observable. There is a very 
satisfactory way of doing this in the case of a finite number of degrees of 
freedom. Suppose the system has $3N$ degrees of freedom in a cartesian
coordinate system $x_1, x_2, x_3, \ldots, x_{3N}$. Then the Hilbert space is taken to be the
space $L^2(R^{3N}, d\lambda)$ of square integrable complex-valued functions on $R^{3N}$.
Using the correspondence principle or other heuristic arguments, the position
observables are prescribed as $Q_if(x) = x_if(x)$, $i = 1, 2, \ldots, 3N$, and the
momentum observables as $P_if(x) = -i\hbar(\partial/\partial x_i)f(x)$, $i = 1, 2, \ldots, 3N$. The
other observables are now given in terms of these basic observables. The
quantum mechanics is now completely described.


The above procedure is satisfactory for the following reason. The position
and momentum operators satisfy the canonical commutation relations
$[Q_j, P_k] = i\hbar\delta_{jk}$. (To be perfectly rigorous one should work with the Weyl form
of the commutation relations but for simplicity we shall be a little imprecise
here.) Now by a theorem of von Neumann (1931), if $Q_i^0$, $P_i^0$, $i = 1, 2, \ldots, 3N$,
are an (irreducible) set of self-adjoint operators on a Hilbert space $H$ which
satisfy the above commutation relations, then $H$ is unitarily equivalent to
$L^2(R^{3N}, d\lambda)$ and $Q_i^0$, $P_i^0$ are equivalent to $Q_i$, $P_i$ defined above, respectively.
Thus, if the framework is to satisfy these basic commutation relations the
Hilbert space and the observables (and hence the states, dynamics, etc.) are
uniquely determined within a unitary equivalence.

Now for some of the weaknesses of the classical approach. Von Neumann's 
theorem does not extend to systems with an infinite number of degrees of 
freedom. In fact, one can show that in this case there are infinitely many (in fact, 
uncountably infinitely many) inequivalent representations of the canonical 
commutation relations. Each of these representations gives different results. 
How is one to choose the 'right' representation? In fact, if one chooses the most 
'natural' representation, namely Fock space, the results are unsatisfactory 
when interactions are involved. This problem would be merely a mathematical
curiosity if no important physical system had an infinite number of degrees of
freedom, but unfortunately all of quantum field theory lies within this range.
Furthermore, this discussion would be unnecessary if the present quantum field 
theory were successful, but this is far from the case. 

A second weakness of the classical approach is the basic axioms themselves. 
Where does the Hilbert space come from? Why describe observables by 
self-adjoint operators? There seems to be no really convincing reason. In other 
words, the axioms are ad hoc, devoid of empirical evidence. There are some 
who believe that the troubles encountered in quantum field theory may be due 
to the basic axioms. They feel that if these axioms were established on a firmer 
empirical foundation then many of the difficulties would dissolve. 

Besides the basic axioms, there is another assumption, of a less important 
character, which is of a questionable nature. This is the continuity condition 
placed on the states. In Formulation 1, this was defined in terms of the strong 
convergence of operators. But there seems to be no physical significance for 
this type of convergence. In Formulation 2 this convergence is contained 
implicitly in the countable additivity of states. It is clear that states should be
finitely additive but there is no physical reason for them to be countably
additive.

The algebraic approach was initiated by Jordan, von Neumann and Wigner 
(1934). It was later developed by Segal (1947), Haag and Kastler (1964) and 
many others. In this approach, the bounded observables are taken as the 


primitive axiomatic elements. We begin with a slight modification of Segal's 

(A) Segal Algebras 

A collection of objects $\mathcal{A}$ is called a Segal algebra if $\mathcal{A}$ satisfies the following
axioms.

Axiom A. $\mathcal{A}$ is a linear space over the real numbers $R$.
Axiom B. There exists in $\mathcal{A}$ an identity $I$ and for every $A \in \mathcal{A}$ and integer $n \ge 0$
an element $A^n \in \mathcal{A}$ which satisfies the following: If $f$, $g$ and $h$ are real
polynomials, and $f(g(\lambda)) = h(\lambda)$ for every $\lambda \in R$, then $f(g(A)) = h(A)$; where

$$f(A) = \beta_0I + \sum_{k=1}^{n}\beta_kA^k \quad\text{if}\quad f(\lambda) = \beta_0 + \sum_{k=1}^{n}\beta_k\lambda^k$$

Axiom C. There is defined for each $A \in \mathcal{A}$ a real number $\|A\| \ge 0$ such that the
pair $(\mathcal{A}, \|\cdot\|)$ is a real Banach space.
Axiom D. $\|A^2 - B^2\| \le \max(\|A^2\|, \|B^2\|)$ and $\|A^2\| = \|A\|^2$.
Axiom E. $A^2$ is a continuous function of $A$.

Of course, a Segal algebra is supposed to describe axiomatically the set of 
bounded observables for a physical system. The underlying idea is that an 
observable is determined by its average values as given by laboratory experiments.
Let $A$ be an observable and $\lambda \in R$. If the average values of $A$ as
determined by a laboratory experiment are multiplied by $\lambda$, then this determines
a new observable $\lambda A$. If the average values of two observables $A$ and $B$
are added, then this determines a new observable $A + B$ whose average values
are these sums. This argument justifies Axiom A. Unfortunately, this procedure
cannot be used to define products since in general the expectation of a
product is not the product of the expectations. For this reason, products of 
observables are not defined. Axiom B states that polynomials in a single 
observable exist and enjoy the usual properties. The norm in Axiom C can be 
thought of as the maximum absolute expectation value of an observable. The 
properties of a norm then easily follow. The completeness is included for 
mathematical convenience since if the system were not complete it could be 
completed in the usual way still preserving all the axioms. Axiom D can be 
justified in terms of the interpretation of the norm given above. Axiom E is a 
natural continuity condition. We henceforth call the elements of a Segal 
algebra observables. 

An example of a Segal algebra is the set of bounded self-adjoint operators on
a Hilbert space as given in Formulation 1 of the classical approach. Another
example is the Banach space $\mathcal{C}(T)$ of all continuous real-valued functions on a
compact Hausdorff space $T$ under the supremum norm. This example corresponds
to a classical mechanical system. In this case, $T$ corresponds to a phase
space and $\mathcal{C}(T)$ is the set of observables (dynamical variables) which are
necessarily compatible (or commuting or simultaneously measurable).

Let $\mathcal{A}$ be a Segal algebra. A state of $\mathcal{A}$ is a real-valued linear functional $\omega$ on
$\mathcal{A}$ such that $\omega(A^2) \ge 0$ for all $A \in \mathcal{A}$ and $\omega(I) = 1$. The states are supposed to
describe the expectation values of the observables for a particular preparation
or condition of the physical system. With this interpretation, the above
properties of a state are clear. A collection of states $\mathcal{S}$ on $\mathcal{A}$ is full if for any two
distinct observables $A$, $B$ there exists a state $\omega \in \mathcal{S}$ such that $\omega(A) \neq \omega(B)$.
Segal (1947) has shown that any Segal algebra has a full set of states and that

$$\|A\| = \sup\{|\omega(A)| : \omega \in \mathcal{S}\}$$

for all $A \in \mathcal{A}$. This latter fact justifies our interpretation of the norm as the
maximum absolute expectation value of an observable.

Although products of arbitrary observables are not defined, we can define a
'symmetrized product'. For $A, B \in \mathcal{A}$ the symmetrized product $A \circ B$ is
defined by $A \circ B = \tfrac{1}{2}[(A+B)^2 - A^2 - B^2]$. The physical significance of $A \circ B$ is
not clear, although it is a convenient mathematical construct. This product does
not enjoy very many algebraic properties. It is clearly commutative and from
Axiom B, $A \circ I = A$ for every $A \in \mathcal{A}$. However, in general it need not be
homogeneous ($(\lambda A) \circ B = \lambda(A \circ B)$), distributive ($A \circ (B + C) = A \circ B + A \circ C$),
or associative ($A \circ (B \circ C) = (A \circ B) \circ C$). In fact, the Segal
algebra of all bounded self-adjoint operators on a Hilbert space is an example
in which the associative law does not hold for the symmetrized product. We
shall later give an example in which distributivity and homogeneity do not hold.
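
The failure of associativity in the operator case can be checked by hand on 2x2 matrices. The following is a minimal sketch, assuming the Pauli matrices as test operators; the helper names (add, scale, mul, sym, close) are illustrative and not part of the text.

```python
# 2x2 complex matrices as nested tuples; all helper names are illustrative.
def add(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def scale(c, A):
    return tuple(tuple(c * a for a in row) for row in A)

def mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def sym(A, B):
    """Symmetrized product A o B = 1/2 [(A+B)^2 - A^2 - B^2]."""
    S = add(A, B)
    return scale(0.5, add(mul(S, S), scale(-1, add(mul(A, A), mul(B, B)))))

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = ((1, 0), (0, 1))
SX = ((0, 1), (1, 0))        # Pauli sigma_x (self-adjoint)
SY = ((0, -1j), (1j, 0))     # Pauli sigma_y (self-adjoint)

# Commutative, I acts as identity, and A o B = (AB + BA)/2 for operators.
assert close(sym(SX, SY), sym(SY, SX))
assert close(sym(SX, I2), SX)
assert close(sym(SX, SY), scale(0.5, add(mul(SX, SY), mul(SY, SX))))

# Not associative: sigma_x o (sigma_x o sigma_y) = 0,
# while (sigma_x o sigma_x) o sigma_y = I o sigma_y = sigma_y.
left = sym(SX, sym(SX, SY))
right = sym(sym(SX, SX), SY)
assert not close(left, right)
```

Since the two Pauli matrices anticommute, their symmetrized product vanishes, which is what makes the two bracketings differ.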

Lemma 1. The symmetrized product is homogeneous if and only if it is
distributive.

Proof. Suppose the symmetrized product is homogeneous. It follows that
$-(A \circ B) = A \circ (-B)$. Writing this out in terms of the definition gives

$$(A+B)^2 + (A-B)^2 = 2A^2 + 2B^2 \qquad (1)$$

It follows that

$$A \circ B = \tfrac{1}{4}[(A+B)^2 - (A-B)^2] \qquad (2)$$

Now substitute $A + C$ and $A - C$ for $A$ in (1) to get the following two
equations:

$$[(A+B)+C]^2 + [(A-B)+C]^2 = 2(A+C)^2 + 2B^2$$

$$[(A+B)-C]^2 + [(A-B)-C]^2 = 2(A-C)^2 + 2B^2$$

Subtracting these last two equations and using (2) gives

$$(A+B) \circ C + (A-B) \circ C = 2(A \circ C) = (2A) \circ C$$

Replace $A+B$ by $A$ and $A-B$ by $B$ to get

$$A \circ C + B \circ C = (A+B) \circ C \qquad (3)$$


Conversely, suppose the distributive law (3) holds. Then replacing $A$ by $B$
gives $2(B \circ C) = (2B) \circ C$. Now replacing $B$ by $B/2$ gives $(\tfrac{1}{2}B) \circ C = \tfrac{1}{2}(B \circ C)$.
Replacing $A$ by $2B$ gives $3(B \circ C) = (3B) \circ C$, etc. Also, replacing $A$ by $-B$
gives $(-B) \circ C = -(B \circ C)$. In this way $\lambda(B \circ C) = (\lambda B) \circ C$ for every rational $\lambda$.
But since addition and squaring are continuous, so is the symmetrized product.
It follows by continuity that the symmetrized product is homogeneous.

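Corollary: If the symmetrized product is associative, then it is homogeneous
and distributive.
Proof. If the symmetrized product is associative, then

$$(\lambda A) \circ B = (\lambda I \circ A) \circ B = \lambda I \circ (A \circ B) = \lambda(A \circ B)$$

Hence the product is homogeneous and, by Lemma 1, distributive.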

The converse of this corollary does not hold as the Hilbert space example
shows. We now give an example which shows that the symmetrized product
need not be homogeneous (and hence not distributive or associative). This
example is a simplified version of one due to Sherman (1956). Let $X = \mathbb{R}^3$ and
define addition and multiplication by scalars in the usual ways. Let $I = (1, 1, 1)$
and $(\alpha I)^n = \alpha^n I$ for $n \ge 0$ an integer. If $x = (x_1, x_2, x_3) \in X$, let $\bar{x} = \max_i x_i$,
$\underline{x} = \min_i x_i$ and let $X_0 = \{x \in X : \bar{x} = 1, \underline{x} = -1\}$. If $x \in X_0$, define $x^n = x$ if $n$ is an
odd integer, and $x^n = I$ if $n$ is an even integer. If $x \in X$, then it is easy to see that
there exists an $x_0 \in X_0$ such that $x = \alpha x_0 + \beta I$, $\alpha, \beta \in \mathbb{R}$. Define

$$x^n = (\alpha x_0 + \beta I)^n = \sum_{i=0}^{n} \binom{n}{i} \alpha^i \beta^{n-i} x_0^i$$

It is easy to see that $x^n$ is well-defined. For $x \in X$ we define

$$\|x\| = \|\alpha x_0 + \beta I\| = \max\{|\beta - \alpha|, |\alpha + \beta|\}$$

Sherman (1956) has shown that with these operations $X$ is a Segal algebra.
Now let $a = (1, 1, 0)$ and $b = (1, 0, 1)$. We shall show that $2(a \circ b) \ne (2a) \circ b$.
Indeed, $a = \tfrac{1}{2}(1, 1, -1) + \tfrac{1}{2}I$ and $b = \tfrac{1}{2}(1, -1, 1) + \tfrac{1}{2}I$ and so $a + b = \tfrac{1}{2}(1, -1, -1) + \tfrac{3}{2}I$:

$$2(a \circ b) = [(a+b)^2 - a^2 - b^2] = [(4,1,1) - (1,1,0) - (1,0,1)] = (2,0,0)$$

$$(2a) \circ b = \tfrac{1}{2}[(2a+b)^2 - (2a)^2 - b^2] = \tfrac{1}{2}[(9,5,1) - (4,4,0) - (1,0,1)] = (2, \tfrac{1}{2}, 0)$$
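
The arithmetic in this example is easy to get wrong by hand, so here is a small sketch that recomputes it with exact rationals. It assumes the squaring rule implied by the definitions above, namely $x^2 = (\alpha^2 + \beta^2) I + 2\alpha\beta x_0$ because $x_0^2 = I$; the function names are illustrative only.

```python
from fractions import Fraction

def square(x):
    """Square in the Sherman-type algebra on R^3 sketched in the text.

    With x = alpha*x0 + beta*I, max(x0) = 1 and min(x0) = -1, one has
    alpha = (max x - min x)/2 and beta = (max x + min x)/2, and
    x^2 = (alpha^2 + beta^2) I + 2*alpha*beta*x0 since x0^2 = I.
    """
    x = tuple(Fraction(v) for v in x)
    alpha = (max(x) - min(x)) / 2
    beta = (max(x) + min(x)) / 2
    # alpha*x0 = x - beta*I, so 2*alpha*beta*x0 = 2*beta*(x - beta*I).
    return tuple(alpha**2 + beta**2 + 2 * beta * (v - beta) for v in x)

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def scale(c, x):
    return tuple(c * v for v in x)

def sym(x, y):
    """Symmetrized product x o y = 1/2 [(x+y)^2 - x^2 - y^2]."""
    s, sx, sy = square(add(x, y)), square(x), square(y)
    return tuple(Fraction(1, 2) * (p - q - r) for p, q, r in zip(s, sx, sy))

a = (1, 1, 0)
b = (1, 0, 1)

lhs = scale(2, sym(a, b))   # 2(a o b)
rhs = sym(scale(2, a), b)   # (2a) o b

assert square(add(a, b)) == (4, 1, 1)
assert square(add(scale(2, a), b)) == (9, 5, 1)
assert lhs != rhs           # homogeneity fails
```

Using Fraction keeps the middle component of $(2a) \circ b$ exact, which is where the two sides differ.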

A Segal algebra is compatible if the symmetrized product is associative. A
collection of observables is compatible if the subalgebra generated by the
collection is compatible. Segal (1947) has proved that a compatible Segal
algebra is isomorphic (algebraically and metrically) with the algebra $\mathcal{C}(T)$ of all
real-valued continuous functions on a compact Hausdorff space $T$ considered
earlier. It is well known that the states on $\mathcal{C}(T)$ consist of the regular Borel
probability measures on $T$; that is, if $\omega$ is a state, there exists a regular Borel
probability measure $\mu$ on $T$ such that $\omega(f) = \int f \, d\mu$ for every $f \in \mathcal{C}(T)$. It follows
from these results that a compatible Segal algebra can be thought of as the set of
observables in a classical mechanical system. Furthermore, compatible observables
can be thought of as being simultaneously measurable.

(B) C*-Algebras 

The Segal algebras considered earlier were based upon axioms that had 
physical relevance. Unfortunately, their mathematical structure is so weak that 
not much further progress has been made in terms of using them for the study of 
quantum theory. To proceed further, additional axioms have been imposed 
which have not been given physical justification. One of these is to postulate the 
distributive law for the symmetrized product. In a mathematical sense, the
distributive law is a rather mild requirement. In fact, by the proof of Lemma 1 this
law is equivalent to requiring that $(-A) \circ B = -(A \circ B)$ and
$(2A) \circ B = 2(A \circ B)$ for all $A, B \in \mathcal{A}$. However, the physical reasons for such a
requirement are lacking. Furthermore, additional axioms have been imposed which
are not nearly so mild. These are best stated in terms of C*-algebras.

We first review the terminology of C*-algebras. A complex algebra is a
complex vector space equipped with an identity and a distributive, associative
product $AB$. An involution on a complex algebra $\mathcal{B}$ is a map $*$ of $\mathcal{B}$ into itself
which satisfies $(A^*)^* = A$, $(A+B)^* = A^* + B^*$, $(\lambda A)^* = \lambda^* A^*$ and
$(AB)^* = B^* A^*$ for every $A, B \in \mathcal{B}$ and complex $\lambda$ with $\lambda^*$ denoting the complex
conjugate of $\lambda$. An involution algebra is a complex algebra equipped with an
involution. A Banach algebra is an algebra equipped with a norm such that
$\|AB\| \le \|A\| \|B\|$ and which is complete in the norm topology. A C*-algebra is an
involutive Banach algebra $\mathcal{B}$ satisfying $\|A^*A\| = \|A\|^2$ for every $A \in \mathcal{B}$. An
example of a C*-algebra is the set $\mathcal{B}(H)$ of all bounded linear operators on a
complex Hilbert space $H$. In this case $(*)$ is the adjoint map and $\|\cdot\|$ is the
operator norm.

An element $A$ of a C*-algebra is self-adjoint if $A^* = A$. It is straightforward
to show that the set of all self-adjoint elements of a C*-algebra forms a
distributive Segal algebra. In this case the symmetrized product
$A \circ B = \tfrac{1}{2}[(A+B)^2 - A^2 - B^2]$ takes the simple form $A \circ B = \tfrac{1}{2}(AB + BA)$. A distributive
Segal algebra is said to be special if it is isomorphic to the set of all
self-adjoint elements of a C*-algebra. Important unsolved problems are
whether every distributive Segal algebra is special and if not to characterize 
special Segal algebras. These problems appear to be very difficult and it seems 
unlikely that an arbitrary distributive Segal algebra is special. However, all 
Segal algebras that have been encountered in physical situations have been 
special. For this reason and also because the theory of C*-algebras has a rich 
mathematical development it is postulated that the Segal algebra corresponding
to a physical system is special. It can be shown that any state $\omega$ of a special
Segal algebra can be extended to a positive (i.e. $\omega(AA^*) \ge 0$), normalized (i.e.
$\omega(I) = 1$) linear functional on the C*-algebra. Having made this postulate, we
can proceed to the study of C*-algebras interpreting the self-adjoint elements
as observables and the positive, normalized, linear functionals as states.

One of the important consequences of this last postulate is that it provides
the mechanism for representing the elements of a Segal algebra as self-adjoint
operators on a Hilbert space. This follows from the GNS construction (after
Gelfand and Naimark (1943) and Segal (1947)). We now develop the necessary
material to understand this construction.

A map $\pi$ of a C*-algebra $\mathcal{B}$ into the set $\mathcal{B}(H)$ of all bounded linear operators
on a Hilbert space $H$ is said to be a representation of $\mathcal{B}$ if

(1). $\pi(\alpha A + \beta B) = \alpha\pi(A) + \beta\pi(B)$;
(2). $\pi(AB) = \pi(A)\pi(B)$;
(3). $\pi(A^*) = \pi(A)^*$;
for all $A, B \in \mathcal{B}$ and complex numbers $\alpha, \beta$. It can be shown that if $\pi$ is a
representation of $\mathcal{B}$, then $\|\pi(A)\| \le \|A\|$ for every $A \in \mathcal{B}$. Thus a representation
is automatically continuous. A representation $\pi: \mathcal{B} \to \mathcal{B}(H)$ is cyclic if there is
a vector $\psi \in H$ such that the subspace $\pi(\mathcal{B})\psi = \{\pi(A)\psi : A \in \mathcal{B}\}$ is dense in $H$.
In this case, $\psi$ is said to be a cyclic vector for $\pi$. Cyclic representations play an
important role in the theory; in particular, it is easy to show that any representation
is the direct sum of cyclic representations. In many physical applications
the vacuum state plays the part of a cyclic vector.

A positive, normalized, linear functional on a C*-algebra $\mathcal{B}$ is called a state.
Let $\pi: \mathcal{B} \to \mathcal{B}(H)$ be a representation of $\mathcal{B}$ and let $\psi \in H$, $\|\psi\| = 1$. Then the
functional $\omega(A) = \langle\psi, \pi(A)\psi\rangle$ is a state called a vector state associated with the
representation $\pi$. A state is pure if it cannot be written as a convex combination
of other states. Let $\pi: \mathcal{B} \to \mathcal{B}(H)$ and $\pi': \mathcal{B} \to \mathcal{B}(H')$ be two representations of
$\mathcal{B}$. If there is an isomorphism $U: H \to H'$ such that $\pi'(A) = U\pi(A)U^{-1}$ then $\pi$
and $\pi'$ are spatially (or unitarily) equivalent. A closed subspace $M$ of $H$ is
invariant with respect to $\pi: \mathcal{B} \to \mathcal{B}(H)$ if $\pi(A)M \subseteq M$ for every $A \in \mathcal{B}$. A
representation $\pi: \mathcal{B} \to \mathcal{B}(H)$ is irreducible if the only invariant subspaces of $H$
with respect to $\pi$ are $\{0\}$ and $H$. Irreducible representations are the most
economical in the sense that they cannot be written as a direct sum of
subrepresentations.
Theorem 2. (The GNS Construction) Let $\mathcal{B}$ be a C*-algebra and let $\omega$ be a
state on $\mathcal{B}$. Then there exists a Hilbert space $H$ and a cyclic representation
$\pi_\omega: \mathcal{B} \to \mathcal{B}(H)$ with cyclic vector $\psi \in H$ such that $\omega(A) = \langle\psi, \pi_\omega(A)\psi\rangle$ for every
$A \in \mathcal{B}$. If $\pi': \mathcal{B} \to \mathcal{B}(H')$ is another cyclic representation with cyclic vector
$\psi' \in H'$ such that $\omega(A) = \langle\psi', \pi'(A)\psi'\rangle$ for every $A \in \mathcal{B}$, then $\pi_\omega$ and $\pi'$ are
spatially equivalent. Furthermore, $\pi_\omega$ is irreducible if and only if $\omega$ is pure.
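
To make the statement concrete, here is a finite-dimensional sketch of the construction, under assumptions that are not in the text: the algebra is taken to be the 2x2 complex matrices, and the state is $\omega(A) = \mathrm{tr}(\rho A)$ for an invertible diagonal density matrix $\rho$ chosen for illustration. In that case the algebra itself, with inner product $\langle X, Y\rangle = \omega(X^*Y)$, serves as the Hilbert space, left multiplication is the representation, and the identity is the cyclic vector. All names in the code are illustrative.

```python
# Hypothetical example: B = 2x2 complex matrices, omega(A) = trace(RHO A).
def mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adj(A):
    """Adjoint (conjugate transpose)."""
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

def trace(A):
    return A[0][0] + A[1][1]

RHO = ((0.25, 0), (0, 0.75))    # assumed faithful state, so no quotient is needed

def omega(A):
    return trace(mul(RHO, A))

def inner(X, Y):
    """GNS inner product <X, Y> = omega(X* Y) on the algebra itself."""
    return omega(mul(adj(X), Y))

def pi(A):
    """Representation by left multiplication: pi(A) X = A X."""
    return lambda X: mul(A, X)

I2 = ((1, 0), (0, 1))           # cyclic vector: pi(A) I2 = A sweeps out the whole space
A = ((1, 2j), (-2j, 3))
B = ((0, 1), (1j, 2))

# omega is recovered as a vector state of the cyclic vector.
assert abs(inner(I2, pi(A)(I2)) - omega(A)) < 1e-12
# pi respects the involution: <X, pi(A) Y> = <pi(A*) X, Y>.
assert abs(inner(B, pi(A)(I2)) - inner(pi(adj(A))(B), I2)) < 1e-12
# The inner product is positive on a nonzero element.
assert inner(B, B).real > 0
```

For a state that is not faithful, elements of zero norm would have to be quotiented out first; that step is omitted in this sketch.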

The GNS construction has important physical consequences, one of which is
the following. It is not the Hilbert space and the self-adjoint operators on it as
postulated in the classical approach of Section 2 that contain the essence of the
physical system, but it is the C*-algebra generated by the Segal algebra. The
Hilbert space and the operators on it corresponding to the observables depend
upon the state of the system and can be obtained via the GNS construction.
There may be many inequivalent representations of a C*-algebra and the
'right' one is determined by the state of the system.

The above observation overcomes, to a certain extent, one of the weaknesses 
of the classical approach mentioned in Section 2, namely where the Hilbert 
space and self-adjoint operators come from. Another weakness mentioned in 
Section 2 that is overcome is the continuity condition placed on the states of the 
classical approach. In the algebraic approach, no such condition is imposed. 
The states are defined algebraically in terms of the physically natural conditions 
of linearity, positivity and normalization. In fact, there are more states in the 
algebraic approach than those given by the density operators of the classical 
approach. Furthermore, these extra states actually occur in physical situations. 
An example of such a state can be given as follows. Let $\mathcal{A}$ be the Segal algebra
of all bounded self-adjoint operators on an infinite dimensional Hilbert space.
Then $\mathcal{A}$ contains an operator $A$ with non-empty continuous spectrum $\sigma_c(A)$.
Let $\lambda \in \sigma_c(A)$. It can be shown (Segal, 1947) that there exists a pure state $\omega$
such that $\omega(A) = \lambda$. Now this pure state cannot have the form $\omega(A) = \langle\psi, A\psi\rangle$
since then $\psi$ would be an eigenvector of $A$ with eigenvalue $\lambda$, which contradicts
the fact that $\lambda \in \sigma_c(A)$. Such states as the above are delta function-like or
Schwartz distribution-like elements that lie outside the Hilbert space. It can be
shown (Emch, 1972), however, that all states of $\mathcal{A}$ can be approximated in a
certain topology by density operator states.

Although we have now formulated the basic concepts of the algebraic 
approach, we have only scratched the surface of its later developments. We 
have not mentioned such important areas as physical equivalence, symmetry 
groups, representations of canonical commutation and anticommutation rela- 
tions, quasilocal field theories and applications to concrete problems. For 
further study the reader is referred to (Emch, 1972) and the modern literature. 

(C) Strengths and Weaknesses 

One of the strengths of the algebraic approach is that it is based on axioms that 
have more physical relevance than in the classical approach. This is especially 
true of the axioms for a Segal algebra. It has also clarified the existence of 
inequivalent representations of the CCR and CAR. It has enjoyed some 
notable successes that we have not had space to explore and is responsible for 
important applications to such areas as statistical mechanics and solid-state
physics.

Weaknesses of this approach include some of the later axioms, especially the 
jump from a Segal algebra to a C*-algebra. What, for example, is the physical 
significance of the product in a C*-algebra? Furthermore, even with all the 
mathematical power that has been brought to bear, a satisfactory field theory 
has still not been developed.


In the quantum logic approach the propositions of a physical system are taken
as primitive axiomatic elements. The propositions correspond to yes-no (or
true-false) experiments on the physical system. For example, suppose the
physical system consists of a single particle and let $E$ be a region of space. If $a_E$
is a counter which is activated if and only if the particle enters the region $E$, then
$a_E$ corresponds to a proposition. This proposition is true if and only if $a_E$ is
activated and the particle is in the region $E$. The propositions correspond to
two-valued observables and it can be argued (we shall substantiate this later)
that an observable can be decomposed into these simpler two-valued observables.
Thus a treatment of propositions is general enough to describe all
observables. The standard references on this approach are Jauch (1968),
Mackey (1963) and Varadarajan (1968).

(A) Quantum Logics 

Let $\mathcal{P}_0$ be a set of elements called experimental propositions. If $a \in \mathcal{P}_0$ then $a$ is
true or false depending upon the state of the system. But in quantum mechanics
one cannot, in general, predict whether a proposition will be true even if the
state is precisely known. All one can predict is the probability that a proposition
will be true. Thus a state $m$ can be thought of as a function from $\mathcal{P}_0$ to the unit
interval $[0, 1]$ and $m(a)$ for $a \in \mathcal{P}_0$ is interpreted as the probability that $a$ is true
when the system is in the state $m$. If $m(a) = 1$ then $a$ is true with certainty in the
state $m$. Now suppose $a, b \in \mathcal{P}_0$ and $m(a) + m(b) \le 1$ for every state $m$. Since
$m(a) = 1$ implies $m(b) = 0$, whenever $a$ is true with certainty, $b$ is false with
certainty. In this case $a$ and $b$ can be interpreted as corresponding to non-interfering
experiments and their truth or falsity can be verified simultaneously.
In this case one can consider the experimental proposition $c$ which is true
with certainty precisely when $a$ and $b$ are both false with certainty. Then we
should have $m(c) + m(a) + m(b) = 1$ for every state $m$. For example, in our
counter experiment suppose $E_1$ and $E_2$ are disjoint regions of space. Then for
any state $m$ the probability the particle is in $E_1$ plus the probability the particle
is in $E_2$ does not exceed unity, $m(a_{E_1}) + m(a_{E_2}) \le 1$. Now if $E_3 = (E_1 \cup E_2)^c$ ($E^c$ is
the complement of the set $E$) we should have $m(a_{E_3}) + m(a_{E_2}) + m(a_{E_1}) = 1$.
Such considerations also carry over to sequences of propositions.

A proposition system is a pair $(\mathcal{P}_0, M)$ where $\mathcal{P}_0$ is a non-empty set and $M$ is a
non-empty set of functions from $\mathcal{P}_0$ into $[0, 1]$ satisfying:
Axiom A. For any sequence $a_1, a_2, \ldots \in \mathcal{P}_0$ such that $m(a_i) + m(a_j) \le 1$, $i \ne j$,
for every $m \in M$, there exists $b \in \mathcal{P}_0$ such that $m(b) + m(a_1) + m(a_2) + \cdots = 1$
for every $m \in M$.

Axiom A is the only axiom that we shall impose on the system. We call the
elements of $M$ states and the elements of $\mathcal{P}_0$ experimental propositions. Now
suppose $a, b \in \mathcal{P}_0$ and $m(a) = m(b)$ for every $m \in M$. Then $a$ and $b$ are
physically indistinguishable and we write $a \sim b$. It is clear that $\sim$ is an
equivalence relation and we denote the equivalence class containing $a$ by $\bar{a}$.
Furthermore, if $m \in M$ we define $m(\bar{a}) = m(a)$. We denote the set of equivalence
classes by $\mathcal{P}$ and call the elements of $\mathcal{P}$ propositions. We call the pair
$(\mathcal{P}, M)$ a quantum logic. Thus a quantum logic is a pair $(\mathcal{P}, M)$ satisfying Axiom
A (with $\mathcal{P}_0$ replaced by $\mathcal{P}$) together with the condition $m(a) = m(b)$ for every
$m \in M$ implies $a = b$. The quantum logic will be the main framework of our
study and for simplicity we shall drop the bars in the following.

Our next order of business is to prove the main structure theorem for
quantum logics. But first we need some definitions. Let $(\mathcal{P}, \le)$ be a partially
ordered set with first and last elements $0$ and $1$, respectively. An orthocomplementation
on $\mathcal{P}$ is a map $a \mapsto a'$ from $\mathcal{P}$ to $\mathcal{P}$ with the following properties:

(1). $a'' = a$ for every $a \in \mathcal{P}$;
(2). $a \le b$ implies $b' \le a'$;
(3). $a \vee a' = 1$ for every $a \in \mathcal{P}$.

In (3) $a \vee a'$ denotes the least upper bound of $a$ and $a'$.

If $(\mathcal{P}, \le)$ has an orthocomplementation $(')$, then $(\mathcal{P}, \le, ')$ is called an
orthocomplemented poset. It is easily verified that in an orthocomplemented
poset if $a \vee b$ exists then so does $a' \wedge b'$ and $a' \wedge b' = (a \vee b)'$. If $a \le b'$, we say
that $a$ and $b$ are orthogonal and write $a \perp b$. An orthocomplemented poset
$(\mathcal{P}, \le, ')$ is $\sigma$-orthocomplete if the following holds:

(4). If $a_1, a_2, \ldots$ is a sequence of mutually orthogonal elements in $\mathcal{P}$ then
$a_1 \vee a_2 \vee \cdots$ exists.

An orthocomplemented, $\sigma$-orthocomplete poset $(\mathcal{P}, \le, ')$ is orthomodular if

(5). $a \le b$ implies $b = a \vee (b \wedge a')$.

A probability measure on a $\sigma$-orthocomplete poset $(\mathcal{P}, \le, ')$ is a map
$m: \mathcal{P} \to [0, 1]$ which satisfies:

(a) $m(1) = 1$;

(b) if $a_1, a_2, \ldots$ is a sequence of mutually orthogonal elements of $\mathcal{P}$ then
$m(\vee a_i) = \sum_i m(a_i)$.

A set $M$ of probability measures on $(\mathcal{P}, \le, ')$ is order determining if
$m(a) \le m(b)$ for every $m \in M$ implies $a \le b$.

Let us now return to the quantum logic $(\mathcal{P}, M)$. If $a, b \in \mathcal{P}$ define $a \le b$ if
$m(a) \le m(b)$ for every $m \in M$. The relation $a \le b$ can be interpreted as $a$
implies $b$. That is, $b$ has a greater probability of being true than $a$. It follows that
whenever $a$ is true with certainty so is $b$. If $a \in \mathcal{P}$, since $m(a) \le 1$ for every
$m \in M$, by Axiom A there exists $b \in \mathcal{P}$ such that $m(b) = 1 - m(a)$ for every
$m \in M$. We then write $b = a'$ and call $b$ the orthogonal complement of $a$. We can
interpret $a'$ as the proposition which is true if and only if $a$ is false. We have
thus defined a relation $\le$ on $\mathcal{P}$ and a map $('): \mathcal{P} \to \mathcal{P}$. We now prove our main
structure theorem. This theorem is due to Maczynski (1974) and the proof is a
simplification of his.

Theorem 3. $(\mathcal{P}, M)$ is a quantum logic if and only if $(\mathcal{P}, \le, ')$ is a
$\sigma$-orthocomplete orthomodular poset and $M$ is an order-determining set of
probability measures on $\mathcal{P}$.

Proof. First suppose $(\mathcal{P}, M)$ is a quantum logic. It is clear from the definition
that $(\mathcal{P}, \le)$ is a poset. It is also clear that $(')$ satisfies conditions (1) and (2) of an
orthocomplementation. For $a \in \mathcal{P}$, we have $m(a) + m(a') = 1$ for every $m \in M$
so by Axiom A there is an element $0 \in \mathcal{P}$ such that $m(0) + m(a) + m(a') = 1$. It
follows that $m(0) = 0$ for every $m \in M$. Define $1 = 0'$. Notice that $0 \le a \le 1$ for
every $a \in \mathcal{P}$ so $0$ and $1$ are the first and last elements of $\mathcal{P}$, respectively. If
$b \ge a, a'$ then $m(a), m(a') \le m(b)$. Hence $m(a) + m(b') \le 1$ and
$m(a') + m(b') \le 1$. Then $a, a', b'$ satisfy the condition of Axiom A so there
exists $c \in \mathcal{P}$ such that $m(c) + m(a) + m(a') + m(b') = 1$ for every $m \in M$. But
then $m(c) = m(b') = 0$ for every $m \in M$. Hence $m(b) = 1$ for every $m \in M$ and
$b = 1$. Hence (3) is satisfied and $(')$ is an orthocomplementation. Now $a \perp b$ if
and only if $m(a) + m(b) \le 1$ for every $m \in M$. Thus if $a_1, a_2, \ldots$ is a sequence of mutually
orthogonal elements, then by Axiom A there exists $b \in \mathcal{P}$ such that
$m(b) + m(a_1) + m(a_2) + \cdots = 1$. Hence $m(b') = \sum m(a_i)$ for every $m \in M$. It
follows that $b' \ge a_1, a_2, \ldots$. Now suppose $c \ge a_1, a_2, \ldots$. Then
$m(a_i) + m(c') \le 1$ for every $m \in M$. Hence the sequence $c', a_1, a_2, \ldots$ satisfies
the condition of Axiom A so there exists $d \in \mathcal{P}$ such that
$m(d) + m(c') + \sum m(a_i) = 1$. Thus $m(b') = \sum m(a_i) = m(c) - m(d) \le m(c)$, so $b' \le c$.
It follows that $b' = \vee a_i$ and $\mathcal{P}$ is $\sigma$-orthocomplete.
Furthermore, since $m(\vee a_i) = m(b') = \sum m(a_i)$ and $m(1) = 1$, it follows that
every $m \in M$ is a probability measure on $\mathcal{P}$. It is obvious that $M$ is order
determining. To show that $\mathcal{P}$ is orthomodular, suppose $a \le b$. Then $a \perp b'$ and
since $a \le a \vee b' = (a' \wedge b)'$, $a \perp a' \wedge b$. Hence for every $m \in M$

$$m[a \vee (a' \wedge b)] = m(a) + m(a' \wedge b) = m(a) + 1 - m(a \vee b') = m(a) + 1 - m(a) - m(b') = m(b)$$

Therefore, $b = a \vee (a' \wedge b)$. Conversely, suppose $(\mathcal{P}, \le, ')$ is a $\sigma$-orthocomplete
orthomodular poset and $M$ is an order-determining set of probability measures
on $\mathcal{P}$. Then $m(a) = m(b)$ for every $m \in M$ implies $a = b$. Suppose $a_1, a_2, \ldots$ is a
sequence in $\mathcal{P}$ such that $m(a_i) + m(a_j) \le 1$, $i \ne j$, for every $m \in M$. Then
$m(a_i) \le 1 - m(a_j) = m(a_j')$ so $a_i \perp a_j$, $i \ne j$. Hence $b = \vee a_i$ exists and $m(b') + \sum m(a_i) = 1$.
Thus Axiom A holds and $(\mathcal{P}, M)$ is a quantum logic.

We thus see that there is no difference between a quantum logic and a
$\sigma$-orthocomplete orthomodular poset with an order-determining set of probability
measures. Notice that a quantum logic need not be a lattice (that is, $a \vee b$
and $a \wedge b$ need not exist). For example, let $\Omega = \{1, 2, 3, 4, 5, 6\}$ and let $\mathcal{P}$ be the
collection of subsets of $\Omega$ with an even number of elements. Order $\mathcal{P}$ by
inclusion and let $(')$ be the usual set complementation. For $i = 1, \ldots, 6$ define
for $a \in \mathcal{P}$, $m_i(a) = 1$ if $i \in a$ and $m_i(a) = 0$ if $i \notin a$. Then if we let
$M = \{m_i : i = 1, \ldots, 6\}$ it is easily verified that $(\mathcal{P}, M)$ is a quantum logic. However, $\mathcal{P}$
is not a lattice since, for example, $\{1, 2, 3, 4\} \wedge \{2, 3, 4, 5\}$ does not exist.
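
This finite example is small enough to enumerate. The sketch below builds the 32 even subsets, checks closure under complementation and Axiom A for pairs, and confirms that the stated meet is missing; the function names are illustrative only.

```python
from itertools import combinations

OMEGA = frozenset(range(1, 7))
# Propositions: subsets of OMEGA with an even number of elements.
P = [frozenset(c) for r in range(0, 7, 2) for c in combinations(sorted(OMEGA), r)]

def state(i):
    """The state m_i: 1 if i belongs to the proposition, 0 otherwise."""
    return lambda a: 1 if i in a else 0

STATES = [state(i) for i in sorted(OMEGA)]

def meet(a, b):
    """Greatest lower bound of a and b in P under inclusion, or None if absent."""
    lower = [c for c in P if c <= a and c <= b]
    greatest = [c for c in lower if all(d <= c for d in lower)]
    return greatest[0] if greatest else None

# The orthocomplement (set complement) stays inside P.
assert all(OMEGA - a in P for a in P)

# Axiom A for pairs: if m(a) + m(b) <= 1 for every state, then some c in P
# satisfies m(c) + m(a) + m(b) = 1 for every state.
for a in P:
    for b in P:
        if all(m(a) + m(b) <= 1 for m in STATES):
            assert any(all(m(c) + m(a) + m(b) == 1 for m in STATES) for c in P)

a = frozenset({1, 2, 3, 4})
b = frozenset({2, 3, 4, 5})
no_meet = meet(a, b)   # lower bounds {}, {2,3}, {2,4}, {3,4} have no greatest element
```

The common lower bounds are the even subsets of $\{2, 3, 4\}$, and the three two-element ones are pairwise incomparable, which is why no greatest lower bound exists.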

We say that two propositions $a, b$ are compatible (written $a \leftrightarrow b$) if there exist
mutually disjoint propositions $a_1$, $b_1$ and $c$ such that $a = a_1 \vee c$, $b = b_1 \vee c$. We
shall see that compatible propositions are ones that can be verified simultaneously;
that is, propositions whose experiments do not interfere. Notice, if $a \perp b$
then $a \leftrightarrow b$ and $0 \leftrightarrow a$, $1 \leftrightarrow a$ for every $a \in \mathcal{P}$. Physically, our interpretation of
$a \le b$ demands that $a \leftrightarrow b$ if $a \le b$. This is indeed the case since if $a \le b$ then by
(5) $b = a \vee (b \wedge a')$ and $a = a \vee 0$ where $a \perp (b \wedge a')$.

We now show how observables can be defined. If $x$ is an arbitrary observable
and $E \in B(\mathbb{R})$ is a Borel set on $\mathbb{R}$, then the pair $(x, E)$ corresponds to the
proposition: 'the observable $x$ has a value in the set $E$'. Thus if $(\mathcal{P}, M)$ is a
quantum logic, an observable can be thought of as a map $x: B(\mathbb{R}) \to \mathcal{P}$.
Furthermore, an observable should satisfy:

(1). $x(\mathbb{R}) = 1$;

(2). If $E \cap F = \emptyset$, then $x(E) \perp x(F)$;

(3). If $E_i \in B(\mathbb{R})$ is a sequence of mutually disjoint sets, then
$x(\cup E_i) = \vee x(E_i)$.

The reader can easily justify these three conditions.

Two observables $x, y$ are compatible (written $x \leftrightarrow y$) if $x(E) \leftrightarrow y(F)$ for all
$E, F \in B(\mathbb{R})$. We shall show later that observables which are compatible may be
thought of physically as being simultaneously measurable.

The reader should note that we have constructed a generalized probability
theory. Instead of being a Boolean $\sigma$-algebra of subsets of a set, our events
(propositions) are more general, belonging to a less restrictive structure. The
usual probability measures are replaced by states and the random variables by
observables. Notice that if $x$ is an observable and $m$ a state, then the probability
that $x$ has a value in $E \in B(\mathbb{R})$ when the system is in state $m$ is $m[x(E)]$. Before
proceeding further, let us consider two examples of quantum logics.

Example 1. Let $\Omega$ be a phase space and let $B(\Omega)$ be the Borel subsets of $\Omega$.
$B(\Omega)$ may be thought of as the set of mechanical events. It is easily checked that
$B(\Omega)$, under set inclusion and complementation, is a $\sigma$-orthocomplete,
orthomodular poset (in fact, it is a Boolean $\sigma$-algebra). The set of states $M$ are
the conventional probability measures on $B(\Omega)$ and these are order determining
so $(B(\Omega), M)$ is a quantum logic. If $x$ is an observable, it follows from a
theorem of Sikorski (1949)-Varadarajan (1962) that there is a measurable
function $f: \Omega \to \mathbb{R}$ such that $x(E) = f^{-1}(E)$ for every $E \in B(\mathbb{R})$. Thus observables
are just inverses of dynamical variables. We thus see that the quantum
logic generalizes classical mechanics and also the conventional Kolmogorov
(1956) formulation of probability theory. It is easily checked that all events
(propositions) and observables are compatible in this example.

Example 2. Let $H$ be a complex Hilbert space and let $\mathcal{P}$ be the collection of
all closed subspaces of $H$. Ordering $\mathcal{P}$ by inclusion and defining the complement
of a subspace as its orthocomplement it is easily checked that $\mathcal{P}$ is a
$\sigma$-orthocomplete, orthomodular poset (in fact, a lattice). Furthermore, the set
of states $M$, by Gleason's theorem (see Section 2), are given by density
operators and are order determining. Hence $(\mathcal{P}, M)$ is a quantum logic.
Identifying closed subspaces with their orthogonal projections, an observable
may be thought of as a projection-valued measure. Since, using the spectral
theorem, there is a one-to-one correspondence between projection-valued
measures and self-adjoint operators, we may identify observables with self-adjoint
operators. Thus the quantum-logic approach, in this case, reduces to
the classical approach of Section 2. It is straightforward to show that $a, b \in \mathcal{P}$
are compatible if and only if their corresponding projections commute. It
follows that two observables are compatible if and only if their corresponding
self-adjoint operators commute.
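
The link between commuting projections and the decomposition $a = a_1 \vee c$, $b = b_1 \vee c$ can be seen in a small real example. The sketch below assumes 3x3 real projection matrices chosen for illustration; for commuting $P$, $Q$ it takes $c = PQ$, $a_1 = P - PQ$, $b_1 = Q - PQ$, and it shows that the same recipe breaks down when the projections do not commute. Function and variable names are illustrative.

```python
def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(A, tol=1e-12):
    return all(abs(v) < tol for row in A for v in row)

def is_projection(A):
    """Real orthogonal projection: symmetric and idempotent."""
    n = len(A)
    symmetric = all(abs(A[i][j] - A[j][i]) < 1e-12 for i in range(n) for j in range(n))
    return symmetric and is_zero(sub(mul(A, A), A))

def commute(A, B):
    return is_zero(sub(mul(A, B), mul(B, A)))

def decompose(P, Q):
    """Candidate (a1, b1, c) with P = a1 + c and Q = b1 + c, using c = PQ."""
    c = mul(P, Q)
    return sub(P, c), sub(Q, c), c

P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]            # onto span{e1, e2}
Q = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]            # onto span{e2, e3}
E1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]           # onto span{e1}
R = [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 0]]    # onto span{e1 + e2}

# Commuting case: the three pieces are mutually orthogonal projections.
a1, b1, c = decompose(P, Q)
assert commute(P, Q)
assert all(is_projection(M) for M in (a1, b1, c))
assert is_zero(mul(a1, b1)) and is_zero(mul(a1, c)) and is_zero(mul(b1, c))

# Non-commuting case: E1 R is not even a projection, so the recipe fails.
assert not commute(E1, R)
assert not is_projection(mul(E1, R))
```

For mutually orthogonal projections the join is the sum, which is why $P = a_1 + c$ corresponds to $a = a_1 \vee c$ here.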

Let us now return to quantum logics. If $x$ is an observable we call
$\{x(E) : E \in B(\mathbb{R})\}$ the range of $x$. It is easily verified that the range of an observable is a
Boolean $\sigma$-algebra.

Lemma 4. (Varadarajan, 1962) Two propositions are compatible if and only if 
they are in the range of a single observable. 

This last lemma justifies the fact that compatible propositions are simultane- 
ously verifiable, since to verify two compatible propositions one need measure 
only a single observable. 

Now let $x$ be an observable and let $u: \mathbb{R} \to \mathbb{R}$ be a Borel function. There is an
operational significance for $u(x)$; namely, if $x$ has the value $\lambda \in \mathbb{R}$, then $u(x)$ has
the value $u(\lambda)$. This is equivalent to saying that the proposition '$u(x)$ has a
value in $E \in B(\mathbb{R})$' is the same as the proposition '$x$ has a value in $u^{-1}(E)$'.
Motivated by this, we define $u(x)$ as $u(x)(E) = x[u^{-1}(E)]$ for all $E \in B(\mathbb{R})$. It is
easily checked that $u(x)$ is an observable and that $u(x) \leftrightarrow x$.
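
In the classical setting of Example 1 the definition $u(x)(E) = x[u^{-1}(E)]$ can be tried out directly. The sketch below assumes a hypothetical six-point phase space and an illustrative dynamical variable $f$; Borel sets are represented as predicates on the reals, so that $u^{-1}(E)$ is the predicate $t \mapsto E(u(t))$. None of these names come from the text.

```python
OMEGA = (0, 1, 2, 3, 4, 5)    # hypothetical finite phase space

def f(w):
    """An illustrative dynamical variable on OMEGA."""
    return w - 2

def observable(g):
    """Observable x with x(E) = g^{-1}(E); the Borel set E is a predicate on R."""
    return lambda E: frozenset(w for w in OMEGA if E(g(w)))

def apply(u, x):
    """u(x)(E) = x[u^{-1}(E)], with u^{-1}(E) given by the predicate t -> E(u(t))."""
    return lambda E: x(lambda t: E(u(t)))

def u(t):
    return t * t

x = observable(f)
ux = apply(u, x)

def E(t):
    """The Borel set [1, 4]."""
    return 1 <= t <= 4

# u(x) agrees with the observable of the composed dynamical variable u(f(w)).
direct = observable(lambda w: u(f(w)))
assert ux(E) == direct(E)
# Normalization: u(x)(R) is the whole phase space.
assert ux(lambda t: True) == frozenset(OMEGA)
```

Every proposition $u(x)(E)$ is by construction in the range of $x$, which is consistent with $u(x) \leftrightarrow x$.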

Theorem 5. (Varadarajan, 1962) Two observables x, y are compatible if and 
only if there exists an observable z and Borel functions u, v such that x = u(z) 
and y = v(z). 

This last theorem shows that, physically, compatible observables are simul- 
taneously measurable (i.e. non-interfering) since to measure two compatible 
observables one need only measure a single observable. 

Space does not permit a comparison of the quantum logic approach to the 
algebraic approach of Section 3. However, we mention that it can be shown 
that the approaches are not equivalent. It can also be shown that the Segal 
algebra of Section 3 can be embedded in a weaker type of quantum logic than 
that considered here (Gudder and Boyce, 1970; Plymen, 1968). 

(B) Quantum Systems 

Although some illuminating and physically valuable results have been obtained
in the study of quantum logics, their structure is mathematically so general that
they have not been particularly useful for concrete calculations. A quantum
logic is so general that it is far from the concreteness of the Hilbert space of the
classical approach. What is needed is something like the GNS construction of
the algebraic approach. However, such a construction is impossible unless
more axioms are imposed on the quantum logic. Such steps have been taken
(Piron, 1964; Zierler, 1961; Varadarajan, 1968) and theorems have been
found which represent the propositions of certain types of restricted quantum
logics as closed subspaces of a Hilbert space. However, many of the additional
axioms do not have convincing physical justification. This point is, of course,
arguable. The usual additional axioms are that $\mathcal{P}$ is a complete, atomic,
semi-modular lattice.

There is another approach which does bring the Hilbert space forward 
without imposing additional artificial axioms on the quantum logic $(\mathcal{P}, M)$. This
is to adjoin physical structures to $(\mathcal{P}, M)$ such as physical space, position
observables and symmetry. After all, in the known physical systems there is 
always more than just the quantum logic. There is a space in which the system 
lives, usually some sort of symmetry involved and some kind of distinguished 
observable such as position. We now briefly explore this approach. 

First of all, any physical system concerns a phenomenon that takes place in
some kind of physical arena which we call physical space. Mathematically, we
shall assume that physical space $\mathcal{S}$ is a locally compact Hausdorff space with
second countability. (We include these mathematical esoterics for preciseness.
For the definitions of these terms see any book on topology or the reader can
assume $\mathcal{S} = \mathbb{R}^3$ which is general enough for many discussions and which is the
prototype of such spaces.) In a concrete physical situation, $\mathcal{S}$ might be $\mathbb{R}$, or
$\mathbb{R}^n$, or perhaps four-dimensional space-time, or some region in these spaces.
Now many of the propositions in $\mathcal{P}$ are concerned with the location of the
physical system in $\mathcal{S}$. If such propositions can be verified in the laboratory we
call the system localizable. We shall now define this term mathematically.

Let $B(\mathcal{S})$ denote the Borel sets in $\mathcal{S}$, and if $E \in B(\mathcal{S})$ let the proposition that
the physical system is located in $E$ be denoted by $X(E)$. Thus $X$ is a map from
$B(\mathcal{S})$ into $\mathcal{P}$. It is clear that $X$ is an observable based on $\mathcal{S}$ so

(1). $X(\mathcal{S}) = 1$;

(2). If $E \cap F = \emptyset$, then $X(E) \perp X(F)$;

(3). $X(\cup E_i) = \vee X(E_i)$ if $E_i \cap E_j = \emptyset$, $i \ne j$.

We require that $X$ be maximal in a certain sense. Specifically, let $R(X) \subseteq \mathcal{P}$
denote the range of $X$ and let $\omega$ be a probability measure with domain $R(X)$.
We say that $X$ is maximal if every probability measure $\omega$ on $R(X)$ has a unique
extension $\omega \in M$. We say that a physical system is localizable if there exists a
maximal observable (called a position observable) $X: B(\mathcal{S}) \to \mathcal{P}$.

There are physical systems that are not localizable. However, as indicated by 
the work of Jauch and Piron (1967) many of these systems can be handled using 
a weaker notion of position observable. In this section we shall henceforth only 
consider localizable systems. 


We now consider symmetries. A symmetry may be thought of intuitively as
being a transformation that maps the system into another system which is
physically identical with the original one except for a relabelling. If $a \in \mathcal{P}$, then
after a symmetry transformation we get a new proposition $Wa$. Thus a
symmetry induces a map $W: \mathcal{P} \to \mathcal{P}$. Since $W$ just relabels the propositions, $W$
is a bijection on $\mathcal{P}$ that preserves all the operations on $\mathcal{P}$; that is, $W$ is an
automorphism on $\mathcal{P}$. We denote the automorphisms on $\mathcal{P}$ by $\mathrm{aut}(\mathcal{P})$ and
notice that $\mathrm{aut}(\mathcal{P})$ is a group.

Usually symmetries come from transformations on the physical space $\mathcal{S}$. We
say that a group $G$ is a transformation group on $\mathcal{S}$ if $G$ is a locally compact
topological group with second countability for which there exists a continuous
map from $G \times \mathcal{S}$ onto $\mathcal{S}$ denoted by $(g, s) \to gs$ such that

(1). $s \to gs$ is a homeomorphism of $\mathcal{S}$ with itself for every $g \in G$;
(2). $g_1(g_2(s)) = (g_1 \cdot g_2)(s)$ for every $g_1, g_2 \in G$;
(3). if $s_1, s_2 \in \mathcal{S}$, there exists $g \in G$ such that $s_1 = g s_2$ (transitivity);
(4). $gs = s$ for every $s \in \mathcal{S}$ if and only if $g = e$, the identity element of $G$.

Now if a transformation group is a symmetry for the system it must induce an
automorphism group on $\mathcal{P}$. Let $\mathcal{L} = (\mathcal{P}, \mathcal{M})$ be a quantum logic, $\mathcal{S}$ a physical
space and $X$ a position observable. A symmetry group on $(\mathcal{L}, \mathcal{S}, X)$ is a pair
$\mathcal{G} = (G, W)$ where $G$ is a transformation group on $\mathcal{S}$ and $W$ is a group
homomorphism $W: G \to \mathrm{aut}(\mathcal{P})$ (i.e. $W_{g_1 g_2} = W_{g_1} W_{g_2}$) such that

(W1). $g \to m(W_g(a))$ is continuous for every $m \in \mathcal{M}$, $a \in \mathcal{P}$;
(W2). $X(gE) = W_g(X(E))$ for every $g \in G$, $E \in B(\mathcal{S})$ (covariance).

Condition (W1) is a natural continuity requirement while (W2) is a covariance
condition which gives the natural interpretation that $W_g(X(E))$ is the proposition
that the system is located in the set $gE$. We call $W$ a projective representation
of $G$ in $\mathcal{P}$. We thus see that $g \to W_g$ gives a generalization of a continuous
unitary representation of a group and (W2) generalizes Mackey's imprimitivity
relation (Mackey, 1968).

This completes the background for our extended axiomatic structure. We
shall call a four-tuple $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$, where $\mathcal{L} = (\mathcal{P}, \mathcal{M})$ is a quantum logic, $\mathcal{S}$ a
physical space, $X$ a position observable and $\mathcal{G} = (G, W)$ a symmetry group, a
quantum system. We take the viewpoint that the important physical properties
of a physical system are described by a quantum system.

Let us consider an example. This is the usual formulation of a spinless,
non-relativistic particle moving in one-dimensional space. The set of propositions
$\mathcal{P}$ is the lattice of closed subspaces (or equivalently, the lattice of
orthogonal projections) of the complex Hilbert space $L^2(R, \mu)$ where $\mu$ is
Lebesgue measure on $R$. Let $\mathcal{M}$ be the set of pure states of the form
$m_f(P) = \langle f, Pf \rangle$ where $f \in L^2(R, \mu)$, $\|f\| = 1$ and $f \geq 0$. Then $\mathcal{M}$ is an order-determining
set of states and $(\mathcal{P}, \mathcal{M})$ is a quantum logic. Let $G$ be the group of
translations on $R$; that is, for $a \in R$, $\lambda \to \lambda + a$ is a transformation group on $R$.


Let $U_a: L^2(R, \mu) \to L^2(R, \mu)$ be the map $(U_a f)(\lambda) = f(\lambda - a)$. Then $U_a$ is a
unitary operator and if we define $W_a P = U_a P U_a^{-1}$ for every $P \in \mathcal{P}$, then
$W_a \in \mathrm{aut}(\mathcal{P})$ and $(G, W)$ is a symmetry group. The position observable is given
by $(X(E)f)(\lambda) = \chi_E(\lambda) f(\lambda)$ where $\chi_E$ is the characteristic function of $E \in B(R)$.
We now show that $X$ is maximal. Let $\nu$ be a probability measure on $R(X)$ and
define $\nu_0(E) = \nu(X(E))$, $E \in B(R)$. Then $\nu_0$ is a measure on $B(R)$ that is
absolutely continuous relative to $\mu$ (i.e. $\mu(E) = 0$ implies $\nu_0(E) = 0$). Hence by
the Radon-Nikodym theorem there is a unique $f \in L^1(R, \mu)$, $f \geq 0$, such that
$\nu_0(E) = \int_E f \, d\mu$ for every $E \in B(R)$. Let $g = f^{1/2}$ so that

$$\nu_0(E) = \int_E g^2 \, d\mu = \int \chi_E \, g^2 \, d\mu = \langle g, X(E) g \rangle$$

Then $m_g \in \mathcal{M}$ and since $m_g(X(E)) = \nu(X(E))$ for every $E \in B(R)$, we see that
$m_g$ is the unique extension of $\nu$ in $\mathcal{M}$.
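
The covariance condition (W2) in this example can be checked numerically on a finite stand-in for the real line. The sketch below is only an illustration under stated assumptions: the cyclic grid of N points, the function names and the sample data are all hypothetical and do not come from the text. It applies the translation operator, the multiplication operator by an indicator function, and confirms that conjugating the position projection by a translation gives the position projection of the translated set.

```python
# Discrete sketch: the cyclic grid Z_N is an assumed stand-in for the real line R.
N = 8

def translate(f, a):
    """(U_a f)(k) = f(k - a), with indices taken modulo N."""
    return [f[(k - a) % N] for k in range(N)]

def position(E, f):
    """(X(E) f)(k) = chi_E(k) * f(k): multiply by the indicator of E."""
    return [f[k] if k in E else 0 for k in range(N)]

def conjugate(a, E, f):
    """Apply W_a X(E) = U_a X(E) U_a^{-1} to f."""
    return translate(position(E, translate(f, -a)), a)

def shift_set(E, a):
    """The translated set E + a (mod N)."""
    return {(k + a) % N for k in E}

f = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical sample vector
E = {0, 1, 5}                  # hypothetical Borel set
a = 3                          # translation amount

lhs = position(shift_set(E, a), f)   # X(E + a) f
rhs = conjugate(a, E, f)             # U_a X(E) U_a^{-1} f
covariant = lhs == rhs
```

Both sides keep exactly the entries of f whose index lies in E + a, which is the discrete form of the statement that W_a(X(E)) is the proposition "the system is located in E + a".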

This last example is canonical in a certain respect. We shall show that
corresponding to any quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ there is an underlying
Hilbert space and constructs similar to those in the above example that mirror
much of the axiomatic structure of $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$.

A $\sigma$-finite measure $\mu$ on $B(\mathcal{S})$ is quasi-invariant relative to $G$ if $\mu(E) = 0$
if and only if $\mu(gE) = 0$ for every $g \in G$.

Lemma 6. (Gudder, 1973c) Let $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ be a quantum system. Then there
is a non-zero $\sigma$-finite quasi-invariant measure $\mu$ on $B(\mathcal{S})$ such that for every
$m \in \mathcal{M}$ the measure $E \to m(X(E))$ is absolutely continuous relative to $\mu$.

The space $L^2(F, \mu)$ will serve as the underlying Hilbert space where $F$ is a
certain subset of $\mathcal{S}$. Two states $m_1$ and $m_2$ are orthogonal if there is $a \in \mathcal{P}$ such
that $m_1(a) = m_2(a') = 0$. A set of vectors $H_0$ is said to generate a Hilbert space
$H$ if the closed linear hull of $H_0$ is $H$.

Theorem 7. (Gudder, 1973c) Let $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ be a quantum system. Then
there exists $F \in B(\mathcal{S})$ with the following properties: (a) $\mu(F) \neq 0$, (b) if $E \in B(F)$
and $\mu(E) \neq 0$, then $X(E) \neq 0$, (c) there is a one-to-one map $m \to \hat{m}$ from
$\mathcal{M}$ onto a generating set $\hat{\mathcal{M}}$ in the complex Hilbert space $L^2(F, \mu)$ that
preserves orthogonality.

We shall now see that the Hilbert space $L^2(F, \mu)$ derived in Theorem 7
mirrors many of the structural properties of the quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$.
Let us first consider the symmetry group $\mathcal{G} = (G, W)$. Let $V_0$ be the complex
vector space generated by $\hat{\mathcal{M}} = \{\hat{m} : m \in \mathcal{M}\}$, where $m \to \hat{m}$ is the map given in
Theorem 7. Now $W_g \in \mathrm{aut}(\mathcal{P})$ can be thought of as a map from $\mathcal{M}$ into $\mathcal{M}$
defined by $(W_g m)(a) = m(W_g(a))$, $m \in \mathcal{M}$, $a \in \mathcal{P}$, $g \in G$.

Then $W_g$ induces a natural transformation $\hat{W}_g$ on $\hat{\mathcal{M}}$ defined by
$\hat{W}_g \hat{m} = (W_g m)^{\wedge}$. This map is well-defined since $m \to \hat{m}$ is injective. We next extend $\hat{W}_g$
to $V_0$ by linearity. If $g \in G$ define $\mu_g$ by $\mu_g(E) = \mu[g^{-1}(E)]$ for every $E \in B(F)$.
Then $\mu_g$ is absolutely continuous with respect to $\mu$. Let $d\mu_g/d\mu$ be the
Radon-Nikodym derivative.


Theorem 8. (Gudder, 1973c) The map $g \to \hat{W}_g$ is a continuous unitary
representation on $V_0$ and $(\hat{W}_g f)(\lambda) = f(g^{-1}\lambda)\,[d\mu_g/d\mu(\lambda)]^{1/2}$ for every $f \in V_0$.
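
The role of the Radon-Nikodym factor in Theorem 8 can be seen in a finite toy model. The following sketch is an illustration only: it assumes a cyclic group of N points acting on itself by translation and a hypothetical positive weight vector w playing the role of the quasi-invariant measure. With a discrete measure the Radon-Nikodym derivative reduces to a ratio of weights, and the square-root factor is exactly what makes the translated function keep its norm.

```python
import math

N = 6
w = [1.0, 2.0, 0.5, 3.0, 1.5, 4.0]   # hypothetical weights: mu({k}) = w[k] > 0

def rn_derivative(g, k):
    """d(mu_g)/d(mu) at k, where mu_g(E) = mu(g^{-1} E) and g acts by k -> k + g."""
    return w[(k - g) % N] / w[k]

def W(g, f):
    """(W_g f)(k) = f(g^{-1} k) * sqrt(d(mu_g)/d(mu)(k))."""
    return [f[(k - g) % N] * math.sqrt(rn_derivative(g, k)) for k in range(N)]

def norm_sq(f):
    """Squared norm in L^2(Z_N, mu)."""
    return sum(abs(f[k]) ** 2 * w[k] for k in range(N))

f = [1.0, -2.0, 0.5, 3.0, 0.0, 1.0]            # hypothetical sample vector
norms = [norm_sq(W(g, f)) for g in range(N)]   # should all equal norm_sq(f)
composed = W(2, W(3, f))                       # W_2 W_3 f
direct = W(5, f)                               # W_{2+3} f
```

Without the square-root factor the plain translation would not be norm-preserving for a non-uniform measure, so the factor is what turns a quasi-invariant measure into a unitary representation.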

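Now $L^2(F, \mu) = \bar{V}_0$, the closure of $V_0$, and $\hat{W}_g$ can be extended to a unitary
transformation on $L^2(F, \mu)$ which we also denote by $\hat{W}_g$. We thus see that the
states $\mathcal{M}$ are represented by certain unit vectors $\hat{\mathcal{M}}$ in the Hilbert space $L^2(F, \mu)$
and that the symmetry group $\mathcal{G}$ is represented by a unitary representation $\hat{W}_g$
of $G$ on $L^2(F, \mu)$. We represent $X$ on $L^2(F, \mu)$ by

$\hat{X}(E)$ = proj. on the closed span of $\{\hat{m} : m(X(E)) = 1\}$

Theorem 9. (Gudder, 1973c) If $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is a quantum system, then
$[\hat{X}(E)f](\lambda) = (\chi_E f)(\lambda)$, $f \in L^2(F, \mu)$.

Denote the lattice of all orthogonal projections on $L^2(F, \mu)$ by $\hat{\mathcal{P}}$. We thus
see that $\hat{X}$ is a position observable on $\hat{\mathcal{P}}$. Now $\hat{W}_g$ induces an automorphism on
$\hat{\mathcal{P}}$ defined by $W^*_g P = \hat{W}_g P \hat{W}_g^{-1}$ for every $P \in \hat{\mathcal{P}}$.

Theorem 10. (Gudder, 1973c) If $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is a quantum system, then
$\hat{X}[g(E)] = W^*_g \hat{X}(E)$ for all $g \in G$, $E \in B(F)$ and $(G, W^*_g)$ forms a symmetry
group on $\hat{\mathcal{P}}$.

Letting $\hat{\mathcal{L}} = (\hat{\mathcal{P}}, \hat{\mathcal{M}})$ and $\hat{\mathcal{G}} = (G, W^*_g)$ we see that the structure of a quantum
system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is mirrored by the Hilbert space quantum system
$(\hat{\mathcal{L}}, \mathcal{S}, \hat{X}, \hat{\mathcal{G}})$.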

(C) Strengths and Weaknesses 

One of the strengths of the quantum-logic approach is that its axioms are
simple and physically justified. This approach has contributed to the understanding
of many quantum-mechanical concepts. However, one of its weaknesses
is that it is too general for use in concrete problems. The quantum systems
studied above are an improvement, but the representation of a quantum system
$(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ as a Hilbert space quantum system $(\hat{\mathcal{L}}, \mathcal{S}, \hat{X}, \hat{\mathcal{G}})$ given above is not
completely satisfactory for the following two reasons. First, except for the propositions
in the range of $X$, there is no isomorphism between $\mathcal{P}$ and $\hat{\mathcal{P}}$, so the
propositions in general are not represented by $\hat{\mathcal{P}}$. Second, there is no provision
for distinguishing between pure and mixed states since all the states in $\mathcal{M}$ are
mapped onto pure states in $\hat{\mathcal{M}}$.


In this approach the states are taken as the undefined primitive axiomatic 
elements. The important property of states, as far as this approach is con- 
cerned, is that they are closed under the formation of convex mixtures. Now it is 
easy to define a convex combination of elements in a linear space. However, the 
linear space is artificial and devoid of physical meaning for states. One cannot 


add states or multiply them by scalars to get other states. Only the operation of
forming convex combinations of states has meaning. For this reason an abstract
definition of convex mixtures is given that is independent of the concept of
linearity. This approach to convexity originated with Stone (1949) and von
Neumann and Morgenstern (1944) and was later developed by Mielnik (1968,
1969), Ludwig (1968), Davies and Lewis (1970) and others.

(A) Convex Structures 

We begin with a framework due to Noll and Cain (1974). Let $S$ be the set of
states for a physical system. We would like to define a notion of mixing a finite
number of elements of $S$ according to a given recipe. These recipes are
described by listing the finite number of elements that are to be mixed together
with the proportion of each element. Thus a recipe can be thought of as a
function $f: S \to [0, 1]$ such that

(1). $f(s) = 0$ except for finitely many $s$'s;
(2). $\sum_{s \in S} f(s) = 1$.

If we define the support $\mathrm{supp}\, f$ of a function $f$ to be the set on which $f$ does not
vanish, then condition (1) is the same as saying that $\mathrm{supp}\, f$ is finite. We define the simplex
$\Delta S$ of $S$ to be the set of recipes on $S$.

There is a natural map $\delta: S \to \Delta S$ whose values $\delta_s$ are given by $\delta_s(t) = 1$ if $t = s$
and $\delta_s(t) = 0$ if $t \neq s$. Thus $\delta_s$ is the characteristic function of the singleton set $\{s\}$.
Notice that every recipe has the form $f = \sum_{i=1}^{n} \lambda_i \delta_{s_i}$, where $\lambda_i > 0$, $\sum_{i=1}^{n} \lambda_i = 1$.
Furthermore, there is a natural map $T: \Delta\Delta S \to \Delta S$ given by $T(F) = \sum_{f \in \Delta S} F(f) f$.
Finally, with every map $M: \Delta S \to S$ we can associate, in a natural way, a
corresponding map $\hat{M}: \Delta\Delta S \to \Delta S$ defined as

$$\hat{M}(F)(s) = \sum \{F(f) : f \in \Delta S,\ M(f) = s\}$$

We say that $M: \Delta S \to S$ is a mixing operation for $S$ if $M$ is surjective and
satisfies $M \circ T = M \circ \hat{M}$. A less concise but more illuminating way of writing this
last equation is

$$M\Big(\sum \lambda_i f_i\Big) = M\Big(\sum \lambda_i \delta_{M(f_i)}\Big) \qquad (4)$$

when $f_i \in \Delta S$, $\lambda_i > 0$, $\sum \lambda_i = 1$.

We can interpret $M$ as follows. If $f \in \Delta S$ is a recipe then $M(f)$ is the state
resulting from mixing the $s \in S$ in the relative amounts $f(s)$ prescribed by the
recipe. Intuitively $M(f)$ means mix the states according to the recipe $f$.
Condition (4) means that if we mix a set of mixtures, we obtain the same result
as when we apply each mixture individually and then mix.
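
Condition (4) can be made concrete with a small sketch. The code below is illustrative only and rests on assumptions not in the text: states are taken to be rational numbers, the mixing operation M is taken to be the ordinary barycentre, recipes are stored as dictionaries, and a recipe of recipes is stored as a list of (weight, recipe) pairs. It checks that flattening and then mixing agrees with mixing each recipe first and then mixing the results.

```python
from fractions import Fraction as Fr

def M(recipe):
    """Mixing operation on S = rationals: the barycentre of a recipe.
    A recipe is a dict {state: weight} with positive weights summing to 1."""
    return sum(weight * state for state, weight in recipe.items())

def T(F):
    """Flatten a recipe of recipes: T(F) = sum over f of F(f) * f.
    F is a list of (weight, recipe) pairs with distinct recipes."""
    flat = {}
    for weight, recipe in F:
        for state, inner in recipe.items():
            flat[state] = flat.get(state, 0) + weight * inner
    return flat

def M_hat(F):
    """M_hat(F)(s) = sum of F(f) over the recipes f with M(f) = s."""
    mixed = {}
    for weight, recipe in F:
        s = M(recipe)
        mixed[s] = mixed.get(s, 0) + weight
    return mixed

# Hypothetical sample data: two recipes and a recipe that mixes them.
f1 = {Fr(0): Fr(1, 2), Fr(1): Fr(1, 2)}
f2 = {Fr(1): Fr(1, 4), Fr(3): Fr(3, 4)}
F = [(Fr(1, 3), f1), (Fr(2, 3), f2)]

left = M(T(F))        # mix the flattened recipe
right = M(M_hat(F))   # mix each recipe first, then mix the results
```

Exact rational arithmetic is used so that the two sides can be compared with equality rather than a tolerance.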

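It is instructive to see that the usual notion of a mixture in a linear space
satisfies (4). Suppose then that $S$ is a real vector space. Then, in this case, a
mixing operation $M: \Delta S \to S$ should satisfy $M(\sum \lambda_i \delta_{s_i}) = \sum \lambda_i s_i$ ($\lambda_i > 0$, $\sum \lambda_i = 1$),
or more generally $M(\sum \lambda_i f_i) = \sum \lambda_i M(f_i)$ and $M(\delta_{s_i}) = s_i$. But then

$$M\Big(\sum \lambda_i f_i\Big) = \sum \lambda_i M(f_i) = \sum \lambda_i M(\delta_{M(f_i)}) = M\Big(\sum \lambda_i \delta_{M(f_i)}\Big)$$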


Let us now return to the general case. It follows from condition (4) that if $M$
is a mixing operation, then $M(f) = M(\delta_{M(f)})$ for every $f \in \Delta S$. Since $M$ is
surjective, given $s \in S$ there is $f \in \Delta S$ such that $M(f) = s$. Hence $M(\delta_s) = s$ for
every $s \in S$. This is interpreted as meaning that a mixture for a recipe containing
one ingredient is identical to that ingredient.

Let $M$ be a mixing operation on $S$. We define a map from $[0, 1] \times S \times S$ into
$S$, $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$, as follows: $\langle \lambda, s, t \rangle = M[\lambda \delta_s + (1 - \lambda)\delta_t]$. We can interpret
$\langle \lambda, s, t \rangle$ as a mixture of the states $s$ and $t$ in the ratio $\lambda : (1 - \lambda)$. The following
lemma lists the important properties of $\langle \lambda, s, t \rangle$. This lemma is proved by a
straightforward application of the definition.

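Lemma 11. If $M$ is a mixing operation, the map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ satisfies the
following conditions:

(M1). $\langle 1, s, t \rangle = s$;

(M2). $\langle \lambda, s, s \rangle = s$;

(M3). $\langle \lambda, s, t \rangle = \langle 1 - \lambda, t, s \rangle$;

(M4). $\langle \lambda, s, \langle \mu, t, v \rangle \rangle = \langle \lambda + (1 - \lambda)\mu, \langle \lambda[\lambda + (1 - \lambda)\mu]^{-1}, s, t \rangle, v \rangle$ whenever
$\lambda + (1 - \lambda)\mu \neq 0$.

We call a map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ satisfying (M1)-(M4) a binary mixing
operation. The next theorem shows that any mixing operation can be obtained
by successive applications of a binary mixing operation.

Theorem 12. (Noll and Cain, 1974) If $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ is a binary mixing
operation, then there exists a unique mixing operation $M$ such that $\langle \lambda, s, t \rangle =
M[\lambda \delta_s + (1 - \lambda)\delta_t]$. Furthermore $M$ can be obtained by a successive application
of the binary mixing operation.

Because of Theorem 12 we can work exclusively with the binary mixing
operation $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ and we shall do so in the following. A mixing
operation is distinguishing if the corresponding binary mixing operation
satisfies:

(M5). If $\langle \lambda, s, t_1 \rangle = \langle \lambda, s, t_2 \rangle$ for some $s \in S$ and $\lambda \neq 1$, then $t_1 = t_2$.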

Distinguishability is a reasonable physical condition which we shall later see
is equivalent to having enough observables to distinguish between states. We
call a set with a distinguishing mixing operation a convex structure. The axiom of
this approach is the following.

Axiom. The set of states for a physical system forms a convex structure.

The standard example of a convex structure is a real vector space $V$ in which
$\langle \lambda, s, t \rangle = \lambda s + (1 - \lambda)t$. The reader can easily check that this gives a convex
structure. When we consider convex sets in vector spaces we always assume
they are equipped with the above convex structure.
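
The check left to the reader can be spot-tested mechanically. The sketch below is not a proof: it assumes the one-dimensional case (rational numbers as the vector space), uses a small hypothetical grid of sample points and mixing parameters, and takes the side condition of (M4) to be that $\lambda + (1 - \lambda)\mu$ is non-zero, which is what the inverse in that formula requires. It evaluates (M1)-(M5) for the linear mixture on every combination from the grid.

```python
from fractions import Fraction as Fr
from itertools import product

def mix(lam, s, t):
    """Binary mixing <lam, s, t> = lam*s + (1 - lam)*t on the rationals."""
    return lam * s + (1 - lam) * t

lams = [Fr(0), Fr(1, 4), Fr(1, 2), Fr(1)]   # hypothetical sample parameters
pts = [Fr(-1), Fr(0), Fr(2, 3), Fr(5)]      # hypothetical sample states

m1 = all(mix(Fr(1), s, t) == s for s, t in product(pts, pts))
m2 = all(mix(lam, s, s) == s for lam, s in product(lams, pts))
m3 = all(mix(lam, s, t) == mix(1 - lam, t, s)
         for lam, s, t in product(lams, pts, pts))

def m4_holds(lam, mu, s, t, v):
    """Check (M4); it is only required when lam + (1 - lam)*mu is non-zero."""
    a = lam + (1 - lam) * mu
    if a == 0:
        return True
    return mix(lam, s, mix(mu, t, v)) == mix(a, mix(lam / a, s, t), v)

m4 = all(m4_holds(lam, mu, s, t, v)
         for lam, mu in product(lams, lams)
         for s, t, v in product(pts, pts, pts))

# (M5): equal mixtures with the same s and lam != 1 force t1 == t2.
m5 = all(t1 == t2
         for lam in lams if lam != 1
         for s, t1, t2 in product(pts, pts, pts)
         if mix(lam, s, t1) == mix(lam, s, t2))
```

Passing on a finite grid only illustrates the axioms; the general statement follows from the same algebra carried out symbolically.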

It is convenient to also consider a framework which is much more general
than a convex structure. A convex prestructure is a set $S$ together with a function
$(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ from $[0, 1] \times S \times S$ into $S$. This concept is so general that any


non-empty set is a convex prestructure. This is because no conditions are
placed upon the map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$.

If $S_1$ and $S_2$ are convex prestructures, a map $A: S_1 \to S_2$ is affine if
$A\langle \lambda, s, t \rangle_1 = \langle \lambda, As, At \rangle_2$ for all $\lambda \in [0, 1]$, $s, t \in S_1$. We say that $S_1$ and $S_2$ are
isomorphic if there is an affine bijection from $S_1$ to $S_2$. An affine functional $f$ is
an affine map from a convex prestructure $S$ to the real line $R$; that is
$f(\langle \lambda, s, t \rangle) = \lambda f(s) + (1 - \lambda) f(t)$ for all $\lambda \in [0, 1]$, $s, t$ in $S$. We denote the set of
affine functionals on $S$ by $S^*$ and say that $S^*$ is total if for any $s, t \in S$ with $s \neq t$
there exists $f \in S^*$ such that $f(s) \neq f(t)$.

Suppose S is a convex prestructure that corresponds to the set of states for 
some physical system. Since a bounded observable has an expectation in every 
state, the bounded observables can be thought of as functionals on S. It is also 
physically reasonable that these functionals are affine. Furthermore, since a 
state is determined by the expectation values it gives to bounded observables it 
is reasonable to assume that S* is total. 

Theorem 13. A convex prestructure $S$ is a convex structure if and only if $S^*$ is
total.

Proof. For sufficiency it is a simple matter to show that conditions (M1)-(M5)
hold if $S^*$ is total. For example, for (M1), since $f(\langle 1, s, t \rangle) = f(s)$ for all $f \in S^*$ we
have $\langle 1, s, t \rangle = s$. Necessity will follow from the second representation theorem
proved later.

A convex substructure of a convex structure $S$ is a subset $S_1 \subseteq S$ which
satisfies $\langle \lambda, s, t \rangle \in S_1$ whenever $s, t \in S_1$, $\lambda \in [0, 1]$. A subset $F \subseteq S$ is a face if $F$ is
a convex substructure and if $\langle \lambda, s, t \rangle \in F$ for some $\lambda \in (0, 1)$ implies $s, t \in F$. An
element $s \in S$ is an extreme point if $\{s\}$ is a face. Thus $s$ is an extreme point if and
only if $s$ is not a mixture of other elements.

(B) Representation Theorems 

In this section we give two vector space representation theorems for convex
structures. But first we need some definitions concerning convex sets. Let $S_0$ be
a convex subset of a real vector space $V$ (i.e. $x, y \in S_0$ implies $\lambda x + (1 - \lambda)y \in S_0$
for every $\lambda \in [0, 1]$). The hyperplane, cone and subspace, respectively, generated
by $S_0$ are defined as follows:

$$H(S_0) = \Big\{\sum \lambda_i x_i : \sum \lambda_i = 1,\ x_i \in S_0\Big\}$$
$$K(S_0) = \Big\{\sum \lambda_i x_i : \lambda_i \geq 0,\ x_i \in S_0\Big\}$$
$$V(S_0) = \Big\{\sum \lambda_i x_i : \lambda_i \in R,\ x_i \in S_0\Big\}$$


Two vector spaces are isomorphic if there exists a linear bijection from one to
the other. We first consider the question of uniqueness of representations.

Theorem 14. (Uniqueness) Let $S$ be a convex prestructure and let $T_1$ and $T_2$
be affine bijections from $S$ onto convex subsets $T_1(S)$, $T_2(S)$ of two real vector
spaces $V_1$ and $V_2$, respectively. If $0 \notin H(T_1(S))$ and $0 \notin H(T_2(S))$ then $V(T_1(S))$
and $V(T_2(S))$ are isomorphic.

Proof. The function $g = T_2 \circ T_1^{-1}$ is an affine bijection from $T_1(S)$ to $T_2(S)$. We
first extend $g$ to $K(T_1(S))$. First, if $y \in K(T_1(S))$ then $y = \sum \lambda_i x_i$, $\lambda_i > 0$, $x_i \in T_1(S)$.
Hence $y = (\sum_i \lambda_i) \sum_i (\lambda_i / \sum_j \lambda_j) x_i = \lambda x$ where $\lambda > 0$ and $x \in T_1(S)$. We now
show that the representation $y = \lambda x$ is unique. Indeed, suppose $y = \lambda x = \mu z$
where $\lambda, \mu > 0$ and $x, z \in T_1(S)$. Then if $\lambda \neq \mu$ we have
$0 = \lambda x - \mu z = (\lambda - \mu)[\lambda(\lambda - \mu)^{-1} x - \mu(\lambda - \mu)^{-1} z]$. Since $0 \notin H(T_1(S))$ the second factor on
the right-hand side is not 0. Hence $\lambda = \mu$, which is a contradiction. Thus $\lambda = \mu$
and hence $x = z$. Define $g(y) = \lambda g(x)$. It is easy to see that the extended
$g: K(T_1(S)) \to K(T_2(S))$ is a bijection. The following shows that $g$ is additive on
$K(T_1(S))$:

$$g(\lambda x + \mu y) = g\{(\lambda + \mu)[\lambda(\lambda + \mu)^{-1} x + \mu(\lambda + \mu)^{-1} y]\}$$
$$= \lambda g(x) + \mu g(y) = g(\lambda x) + g(\mu y)$$

Also, $g$ is homogeneous on $K(T_1(S))$ since

$$g(\lambda(\mu x)) = g(\lambda\mu x) = \lambda\mu g(x) = \lambda g(\mu x)$$

for $\lambda, \mu > 0$, $x \in T_1(S)$. We now extend $g$ to $V(T_1(S))$. Suppose $y \in V(T_1(S))$
and $y = \sum \lambda_i x_i$, $\lambda_i \in R$, $x_i \in T_1(S)$. Then the positive and negative coefficients can
be grouped so that $y = \lambda x - \mu z$ where $\lambda, \mu \geq 0$ and $x, z \in T_1(S)$. Thus $y$ has the
form $y = u - v$ where $u, v \in K(T_1(S))$. Define $g$ on $V(T_1(S))$ by $g(y) =
g(u) - g(v)$. This extended $g$ is well-defined since if $u - v = u_1 - v_1$, $u_1, v_1 \in
K(T_1(S))$, then $u + v_1 = u_1 + v$ so by the additivity of $g$ on $K(T_1(S))$ we have
$g(u) + g(v_1) = g(u_1) + g(v)$ and $g(u) - g(v) = g(u_1) - g(v_1)$. That $g: V(T_1(S)) \to
V(T_2(S))$ is linear and bijective is now easily verified.

We say that a convex prestructure $S$ is represented as a convex set $S_0$ in a real
vector space $V$ if there exists an affine bijection $T: S \to S_0$, with $0 \notin H(S_0)$ and
$V(S_0) = V$. It follows from the uniqueness theorem that if $S$ is represented as
convex sets $S_1$, $S_2$ in vector spaces $V_1$ and $V_2$, respectively, then $V_1$ and $V_2$ are
isomorphic. Thus representative convex sets and their vector spaces are unique
up to an isomorphism. Furthermore, if $f \in S^*$, then $f \circ T^{-1}$ is an affine functional
on $S_0$ and by a method similar to that used in the proof of Theorem 14, $f \circ T^{-1}$
has a unique extension into a linear functional plus a constant on $V(S_0)$.

Theorem 15. (First Representation Theorem) A convex prestructure $S$ can
be represented as a convex set $S_0$ if and only if $S^*$ is total.

Proof. Let $F: S \to S_0$ be an affine bijection where $S_0$ is a convex subset of $V$. It is
well-known that the set of linear functionals $V^*$ on $V$ is total over $V$.


Restricting the elements of $V^*$ to $S_0$ we get a total set of affine functionals for
$S_0$. Now if $f \in V^*$, then $f \circ F \in S^*$ so $S^*$ is total. Conversely, suppose $S^*$ is total.
For $x \in S$ define $J(x): S^* \to R$ by $J(x)f = f(x)$. Clearly $S^*$ is a vector space under
pointwise operations and $J(x) \in S^{**}$ so that $J(S) \subseteq S^{**}$. Now $J(S)$ is a convex
set since for $J(x), J(y) \in J(S)$ and $\lambda \in [0, 1]$ we have, for $f \in S^*$,

$$[\lambda J(x) + (1 - \lambda)J(y)]f = \lambda f(x) + (1 - \lambda)f(y) = f(\langle \lambda, x, y \rangle) = J(\langle \lambda, x, y \rangle)f$$

so $\lambda J(x) + (1 - \lambda)J(y) \in J(S)$. To show $J$ is injective suppose $x \neq y \in S$. Then
since $S^*$ is total there is $f \in S^*$ such that $f(x) \neq f(y)$ so $J(x) \neq J(y)$. We now show
that $0 \notin H(J(S))$. If $0 \in H(J(S))$ then there exist $\lambda_i \in R$, $x_i \in S$, $i = 1, \ldots, n$, with
$\sum \lambda_i = 1$ such that $\sum \lambda_i J(x_i) = 0$. Then $\sum \lambda_i f(x_i) = 0$ for every $f \in S^*$. Letting
$f \equiv 1$ we obtain the contradiction $\sum \lambda_i = 0$.

Now let $S$ be a convex structure. If $S^*$ is total the last theorem represented $S$
as a convex set in $S^{**}$. We now give a different representation which, although
isomorphic to the last one by the uniqueness theorem, has a form that is useful
in many applications. A cone is a set $K = \{X, Y, Z, \ldots\}$ on which there is a
binary operation $(X, Y) \mapsto X + Y$ and a scalar multiplication $(\lambda, X) \mapsto \lambda X$, $\lambda \in
R^+$ (i.e. $\lambda > 0$), $X, Y \in K$ satisfying:

(1). $X + Y = Y + X$;

(2). $X + (Y + Z) = (X + Y) + Z$;

(3). if $X + Y = X + Z$, then $Y = Z$;

(4). $\lambda(X + Y) = \lambda X + \lambda Y$;

(5). $(\lambda + \mu)X = \lambda X + \mu X$;

(6). $\lambda(\mu X) = (\lambda\mu)X$;

(7). $1 \cdot X = X$.

Theorem 16. (Second Representation Theorem) A convex structure $S$ can be
represented as a convex set.

Proof. We first show that $S$ can be extended to a cone. Let $P =
\{(\lambda, x) : \lambda > 0, x \in S\}$. We define addition and scalar multiplication on $P$ by
$(\lambda, x) + (\mu, y) = (\lambda + \mu, \langle \lambda(\lambda + \mu)^{-1}, x, y \rangle)$ and $\lambda(\mu, x) = (\lambda\mu, x)$. A straightforward
verification using the properties of a convex structure shows that $P$ is a
cone. We next show that $P$ can be extended to a vector space. Let $V_0 =
\{(X, Y) : X, Y \in P\}$. Define the relation $(X, Y) \sim (X', Y')$ if and only if $X + Y' =
Y + X'$. This is easily seen to be an equivalence relation on $P \times P$. Denote the
equivalence class containing $(X, Y)$ by $[(X, Y)]$ and let $V = \{[(X, Y)] : X, Y \in P\}$.
Define addition on $V$ by $[(X, Y)] + [(X', Y')] = [(X + X', Y + Y')]$. To show this
operation is well-defined, suppose $(X, Y) \sim (X_1, Y_1)$ and $(X', Y') \sim (X_1', Y_1')$.
Then $X + Y_1 = Y + X_1$ and $X' + Y_1' = Y' + X_1'$. Hence $X + X' + Y_1 + Y_1' =
Y + Y' + X_1 + X_1'$ and $(X + X', Y + Y') \sim (X_1 + X_1', Y_1 + Y_1')$. Under addition $V$
is an abelian group with zero $[(X, X)]$. Define a scalar multiplication by real
numbers as follows. If $\lambda > 0$, then $\lambda[(X, Y)] = [(\lambda X, \lambda Y)]$; if $\lambda = 0$, then
$\lambda[(X, Y)] = [(X, X)]$; and if $\lambda < 0$, then $\lambda[(X, Y)] = [(-\lambda Y, -\lambda X)]$. As with
addition, this operation is well-defined. It is straightforward to show that $V$ is a


vector space. Now define the maps $A: S \to P$ and $B: P \to V$ by $Ax = (1, x)$ and
$BX = [(X + Y, Y)]$. The second map is well-defined since $(X + Y, Y) \sim
(X + Z, Z)$ for every $Y, Z \in P$. Hence $B \circ A: S \to V$. Now $B$ is additive since

$$B(X + Z) = [(X + Z + Y, Y)] = [(X + Z + 2Y, 2Y)]$$
$$= [(X + Y, Y)] + [(Z + Y, Y)] = BX + BZ.$$

Also $B$ is homogeneous since for $\lambda > 0$,

$$B(\lambda X) = [(\lambda X + Y, Y)] = [(\lambda X + \lambda Y, \lambda Y)] = \lambda[(X + Y, Y)] = \lambda BX$$

Furthermore, $A$ is an affine map since

$$A\langle \lambda, x, y \rangle = (1, \langle \lambda, x, y \rangle) = (\lambda + (1 - \lambda), \langle \lambda, x, y \rangle)$$
$$= (\lambda, x) + (1 - \lambda, y) = \lambda(1, x) + (1 - \lambda)(1, y)$$
$$= \lambda Ax + (1 - \lambda)Ay$$

It follows that $B \circ A: S \to V$ is affine. It is easily checked that $A$ and $B$ are
injective so $B \circ A$ is injective. Also it is clear that $B \circ A(S)$ is convex and that
$V[B \circ A(S)] = V$. Finally, suppose $0 \in H[B \circ A(S)]$. Then there exist $X_i, Z_i \in P$
and $\lambda_i \in R$ with $\sum \lambda_i = 1$ such that $\sum \lambda_i [(X_i + Z_i, Z_i)] = 0$. Combining the positive
coefficients and negative coefficients, there exist $\lambda, \mu > 0$, $\lambda \neq \mu$, $x, y \in S$,
$Z \in P$ such that $[((\lambda, x) + Z, Z)] = [((\mu, y) + Z, Z)]$. This implies that $(\lambda, x) =
(\mu, y)$. But then $\lambda = \mu$ which is a contradiction.
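
The cone P used in the proof can be sketched concretely. The code below is an illustration under assumptions that are not in the text: states are rational numbers with the linear mixture, elements of P are stored as (intensity, state) pairs, and the sample values are hypothetical. It spot-checks several cone axioms and the affinity of the embedding A on those samples.

```python
from fractions import Fraction as Fr

def mix(lam, s, t):
    """Binary mixing <lam, s, t> on the rationals (assumed linear mixture)."""
    return lam * s + (1 - lam) * t

def add(X, Y):
    """(lam, x) + (mu, y) = (lam + mu, <lam/(lam + mu), x, y>)."""
    (lam, x), (mu, y) = X, Y
    return (lam + mu, mix(lam / (lam + mu), x, y))

def scale(c, X):
    """c * (mu, x) = (c * mu, x) for c > 0."""
    mu, x = X
    return (c * mu, x)

def A(x):
    """Embed a state x as the element (1, x) of P."""
    return (Fr(1), x)

# Hypothetical sample elements of P and sample scalars.
X, Y, Z = (Fr(2), Fr(1, 3)), (Fr(1, 2), Fr(1)), (Fr(3), Fr(0))
lam, mu = Fr(3, 4), Fr(5, 2)

checks = {
    "commutative": add(X, Y) == add(Y, X),
    "associative": add(X, add(Y, Z)) == add(add(X, Y), Z),
    "distributes over addition": scale(lam, add(X, Y)) == add(scale(lam, X), scale(lam, Y)),
    "distributes over scalars": scale(lam + mu, X) == add(scale(lam, X), scale(mu, X)),
    "A is affine": A(mix(lam, Fr(1, 3), Fr(1)))
                   == add(scale(lam, A(Fr(1, 3))), scale(1 - lam, A(Fr(1)))),
}
```

The first component of a pair adds like an intensity while the second is the intensity-weighted mixture of the states, which is why addition in P is associative even though it is built from a binary mixing operation.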

A distance can be defined in a very natural way in a convex structure $S$. If
$x, y \in S$, the closeness of $x$ to $y$ can be measured by comparing mixtures
$\langle \lambda, x_1, x \rangle$, $\langle \lambda, y_1, y \rangle$ of $S$. If $x$ and $y$ are very close we would expect to find a
mixture containing mostly $x$ equal to a mixture containing mostly $y$; that is,
$\langle \lambda, x_1, x \rangle = \langle \lambda, y_1, y \rangle$ in which $\lambda$ is very small. Conversely, if $\langle \lambda, x_1, x \rangle =
\langle \lambda, y_1, y \rangle$ and $\lambda$ is small we expect that $x$ and $y$ are close. Thus the parameters $\lambda$
such that $\langle \lambda, x_1, x \rangle = \langle \lambda, y_1, y \rangle$ give a measure of the closeness of $x$ and $y$. We
thus define a distance function $\sigma$ as follows:

$$\sigma(x, y) = \inf\{0 \leq \lambda \leq 1 : \langle \lambda, x_1, x \rangle = \langle \lambda, y_1, y \rangle,\ x_1, y_1 \in S\}$$

Notice that since $\langle \tfrac{1}{2}, x, y \rangle = \langle \tfrac{1}{2}, y, x \rangle$ we have $0 \leq \sigma(x, y) \leq \tfrac{1}{2}$ for all $x, y \in S$. It is
sometimes useful to make a change of scale and define the distance function
$\rho(x, y) = \sigma(x, y)[1 - \sigma(x, y)]^{-1}$. Then $0 \leq \rho(x, y) \leq 1$ for every $x, y \in S$.
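
For a feel of these definitions, consider the simplest convex set, the interval [0, 1] with the linear mixture. The closed form used below is our own short derivation and is not stated in the text: equality of the two mixtures requires $\lambda(x_1 - y_1) = (1 - \lambda)(y - x)$, and since $|x_1 - y_1| \leq 1$ on the interval, the smallest admissible $\lambda$ is $d/(1 + d)$ with $d = |x - y|$, attained at the endpoints. The sketch, with hypothetical function names, computes $\sigma$ and $\rho$ from that formula and exhibits the witnesses.

```python
from fractions import Fraction as Fr

def mix(lam, s, t):
    """Binary mixing <lam, s, t> on the interval [0, 1] (assumed linear mixture)."""
    return lam * s + (1 - lam) * t

def sigma(x, y):
    """Closed form for S = [0, 1] (our derivation): sigma = d / (1 + d), d = |x - y|."""
    d = abs(x - y)
    return d / (1 + d)

def rho(x, y):
    """Rescaled distance rho = sigma / (1 - sigma)."""
    s = sigma(x, y)
    return s / (1 - s)

def witnesses(x, y):
    """Endpoints (x1, y1) of [0, 1] at which the infimum is attained."""
    return (Fr(1), Fr(0)) if x <= y else (Fr(0), Fr(1))

x, y = Fr(1, 4), Fr(3, 4)      # hypothetical sample states
lam = sigma(x, y)
x1, y1 = witnesses(x, y)
left = mix(lam, x1, x)         # <sigma, x1, x>
right = mix(lam, y1, y)        # <sigma, y1, y>
```

In this example $\rho$ reduces to the ordinary distance $|x - y|$, which is consistent with the bounds $0 \leq \sigma \leq \tfrac{1}{2}$ and $0 \leq \rho \leq 1$.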

Using a representation of $S$ it is straightforward to show that $\sigma$ and $\rho$ are
metrics. One of the important properties of $\sigma$ and $\rho$ is that they are invariant
under isomorphisms. That is, if $A: S_1 \to S_2$ is an isomorphism, then
$\sigma_2(Ax, Ay) = \sigma_1(x, y)$ and $\rho_2(Ax, Ay) = \rho_1(x, y)$ for all $x, y \in S_1$. There is also
a relationship between $\rho$ and transition probabilities. One might expect, since
$0 \leq \rho(x, y) \leq 1$, that $\rho$ has something to do with probabilities. Specifically if
$\rho(x, y)$ is small one might expect the transition probability from $x$ to $y$ to be
large while for large $\rho(x, y)$ a transition from $x$ to $y$ would be unlikely. This is
indeed the case. In fact, in the classical approach, if $\phi$ and $\psi$ are unit vectors


corresponding to pure states $x$ and $y$, it can be shown that the transition
probability $|\langle \phi, \psi \rangle|^2 = 1 - \rho(x, y)^2$. For more details and other results the
reader is referred to (Gudder, 1973a, b).

We next briefly show how this approach can be carried further. Let $S$ be the
convex structure of states. Then by one of the representation theorems, $S$ can
be represented by a convex set $S_0$ in a real vector space $V = V(S_0)$. The metric $\rho$
on $S$ can be transferred to $S_0$ giving a metric $\rho_0$ on $S_0$. It can be shown that there
exists a unique norm $\|\cdot\|$ on $V$ such that $\|x - y\| = \rho_0(x, y)$ for every $x, y \in S_0$. In
one interpretation the states are thought of as 'unit beams' and the cone
$P = \{(\lambda, x) : \lambda > 0, x \in S\}$ of the second representation theorem is the space of
beams. The functional $\tau: P \to R^+$ defined by $\tau[(\lambda, x)] = \lambda$ is interpreted as
giving the beam intensity. It is easy to see that $\tau$ has a unique extension to a
linear functional $\tau$ on $V$ and that $\tau(X) = \|X\|$ for every $X \in P$. The triple
$(V, P, \tau)$ is called a base normed space and is the basic framework for the
operational quantum mechanics of Davies and Lewis (1970).

(C) Strengths and Weaknesses 

The strengths and weaknesses of this approach are similar to those of the 
quantum-logic approach. The axioms are simple and physically motivated. 
Although the approach has important theoretical uses, its practical utility has 
not been exploited. An important unsolved problem in this respect is to 
characterize convex structures that are isomorphic to the set of density 
operators on a Hilbert space. 


Davies, E. B. and Lewis, J. T. (1970) 'An operational approach to quantum probability', Commun. Math. Phys., 17, 239-260.
Dirac, P. A. M. (1930) The Principles of Quantum Mechanics, Clarendon Press, Oxford.
Emch, G. G. (1972) Algebraic Methods in Statistical Mechanics and Quantum Field Theory, Wiley-Interscience, New York.
Gelfand, I. and Naimark, M. A. (1943) 'On the imbedding of normed rings in the ring of operators in Hilbert space', Mat. Sb. N.S., 12 [54], 197-217.
Gleason, A. M. (1957) 'Measures on the closed subspaces of a Hilbert space', J. Math. Mech., 6,
Gudder, S. (1973a) 'State automorphisms in axiomatic quantum mechanics', Intern. J. Theoret. Phys., 7, 205-211.
Gudder, S. (1973b) 'Convex structures and operational quantum mechanics', Commun. Math. Phys., 29, 249-264.
Gudder, S. (1973c) 'Quantum logics, physical space, position observables and symmetry', Rep. Math. Phys., 4, 193-202.
Gudder, S. and Boyce, S. (1970) 'A comparison of the Mackey and Segal models for quantum mechanics', Intern. J. Theoret. Phys., 3, 7-21.
Haag, R. and Kastler, D. (1964) 'An algebraic approach to quantum field theory', J. Math. Phys., 5,
Jauch, J. (1968) Foundations of Quantum Mechanics, Addison Wesley, Reading, Mass.
Jauch, J. and Piron, C. (1967) 'Generalized localizability', Helv. Phys. Acta, 40, 559-570.


Jordan, P., von Neumann, J. and Wigner, E. (1934) 'On an algebraic generalization of the quantum mechanical formalism', Ann. Math., 35, 29-64.
Kolmogorov, A. N. (1956) Foundations of the Theory of Probability, Chelsea, New York.
Ludwig, G. (1968) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories III', Commun. Math. Phys., 9, 1-12.
Mackey, G. W. (1963) The Mathematical Foundations of Quantum Mechanics, W. A. Benjamin Inc., New York.
Mackey, G. W. (1968) Induced Representations and Quantum Mechanics, W. A. Benjamin Inc., New York.
Maczynski, M. J. (1974) 'When the topology of an infinite-dimensional Banach space coincides with a Hilbert space topology', Studia Math., 49, 149-152.
Mazur, S. and Ulam, S. (1932) 'Sur les transformations isometriques d'espace vectoriels normes', C.R. Acad. Sci. Paris, 194, 946-948.
Mielnik, B. (1968) 'Geometry of quantum states', Commun. Math. Phys., 9, 55-80.
Mielnik, B. (1969) 'Theory of filters', Commun. Math. Phys., 15, 1-46.
Noll, W. and Cain, R. N. (1974) 'Convexity, mixing, colors, and quantum mechanics', Preprint: Department of Mathematics, Carnegie-Mellon University, Pittsburgh, Pa.
Piron, C. (1964) 'Axiomatique quantique', Helv. Phys. Acta, 37, 439-468.
Plymen, R. J. (1968) 'C*-algebras and Mackey's axioms', Commun. Math. Phys., 8, 132-146.
Schatten, R. (1950) A Theory of Cross-Spaces, Ann. Math. Studies 26, Princeton University Press, Princeton, N.J.
Segal, I. E. (1947) 'Postulates for general quantum mechanics', Ann. Math., 48, 930-948.
Sherman, S. (1956) 'On Segal's postulates for general quantum mechanics', Ann. Math., 64,
Sikorski, R. (1949) 'On the inducing homomorphisms by mappings', Fund. Math., 36, 7-22.
Stone, M. H. (1930) 'Linear transformations in Hilbert space III. Operational methods and group theory', Proc. Nat. Acad. Sci. U.S.A., 16, 172-175.
Stone, M. H. (1949) 'Postulates for the barycentric calculus', Ann. Mat. Pura Appl., (4) 29, 25-30.
Varadarajan, V. S. (1962) 'Probability in physics and a theorem on simultaneous observability', Commun. Pure Appl. Math., 15, 189-217.
Varadarajan, V. S. (1968) Geometry of Quantum Theory I, Van Nostrand, Princeton, N.J.
Von Neumann, J. (1931) 'Die Eindeutigkeit der Schrödingerschen Operatoren', Math. Ann., 104,
Von Neumann, J. (1932) Grundlagen der Quantenmechanik, Springer, Berlin; English translation by R. T. Beyer, Princeton University Press, Princeton, N.J., 1955.
Von Neumann, J. and Morgenstern, O. (1944) Theory of Games and Economic Behavior, Princeton University Press, Princeton, N.J.
Wigner, E. P. (1931) Gruppentheorie und ihre Anwendung, Vieweg, Braunschweig; English translation by J. J. Griffin, Academic Press, New York, 1959.
Zierler, N. (1961) 'Axioms for non-relativistic quantum mechanics', Pac. J. Math., 11, 1151-1169.


Intermediate Problems for Eigenvalues in 
Quantum Theory 


Ambassador College, Pasadena, U.S.A. 


In any general study of quantum theory one sooner or later becomes involved 
in a discussion of eigenvalue problems. In fact, eigenvalue problems provide 
not only a link with classical mechanics, but are actually, in a sense, typical of 
quantum mechanics even in classical problems. For instance, if we consider 
such classical problems as those of the vibrations of strings, membranes and 
plates and of the buckling of beams and plates, we immediately see quantum- 
like phenomena. 

The frequencies of vibration and buckling loads occur only at discrete 
numerical values. Modes of vibration and buckling 'jump' from one state to 
another. As a curiosity, one could say that these classical problems are more 
purely quantum-like than quantum problems in that the phenomenon of a 
continuous spectrum does not occur in classical cases. 

In all these problems, whether we consider frequencies of vibration, buckling 
loads or energy levels, the common ground is, of course, the eigenvalue 
problem. Therefore, it is not at all surprising that methods, techniques and 
theoretical results dealing with eigenvalue problems of classical mechanics can 
be carried over and applied to problems of quantum mechanics. 

Weinstein's methods of intermediate problems and their variants, to which 
the present chapter is devoted, are particularly exemplary of this kind of 


Let $\mathfrak{H}$ be a real or complex Hilbert space having the scalar product $(u, v)$ and let
$H$ be a self-adjoint linear operator defined on a subspace $\mathfrak{D}$ dense in $\mathfrak{H}$. In
problems discussed here, $H$ is bounded below and the lower part of its
spectrum consists of a finite or infinite number of isolated eigenvalues $\lambda_1 \leq \lambda_2 \leq
\ldots$ each having finite multiplicity. Let $\lambda_\infty$ denote the lowest point (if any) in the



essential spectrum of H. The point A«, could be a non-isolated eigenvalue of 
finite or infinite multiplicity, an isolated eigenvalue of infinite multiplicity or a 
spectral point which is not an eigenvalue. There may be point eigenvalues, even 
isolated point eigenvalues, which are above A*,. However, when we enumerate 
the eigenvalues A u A 2 . . . , we mean the isolated eigenvalues that are below A*,. 
We shall denote by u u u 2 . . ■ a corresponding orthonormal sequence of 

The selection of operators having these properties is motivated by the fact 
that many problems in classical and quantum mechanics involve operators of 
this type. Since the Schrödinger operators for hydrogen, helium, etc., have such
spectra, we call this type of operator 'type-$\mathcal{S}$'.


It is possible to solve exactly for the eigenvalues of only a few operators of
type-$\mathcal{S}$. In most cases one must devise methods of estimating the eigenvalues.
Since an approximation is useless without also specifying its accuracy, the best
approximations for eigenvalues have come by means of complementary
methods, that is, methods which approximate the eigenvalues from above and
below.

The Rayleigh-Ritz Method has been used widely to obtain approximations 
from above (upper bounds) to the eigenvalues of operators of type-$\mathcal{S}$. This
method is fairly straightforward to apply and with the advent of high-speed 
computers has given results of remarkable accuracy. For a detailed discussion 
of the Rayleigh-Ritz Method, see the books of Gould (1966) and Weinstein 
and Stenger (1972). 
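
The upper-bound property of the Rayleigh-Ritz method can be seen in a finite-dimensional model. The following is a minimal sketch, not taken from the works cited: the matrix `H`, the random trial vectors and the function name `rayleigh_ritz` are all hypothetical stand-ins for an operator of the kind discussed here.

```python
import numpy as np

def rayleigh_ritz(H, trial_vectors):
    """Return the Rayleigh-Ritz values of symmetric H on span(trial_vectors)."""
    # Orthonormalize the trial vectors (columns) with a QR factorization.
    Q, _ = np.linalg.qr(trial_vectors)
    # Eigenvalues of the compressed matrix Q^T H Q, in ascending order.
    return np.linalg.eigvalsh(Q.T @ H @ Q)

# Hypothetical 6x6 model operator: a discrete 1-D Laplacian plus a potential.
n = 6
H = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.linspace(0.0, 1.0, n))

rng = np.random.default_rng(1)
trial = rng.standard_normal((n, 3))      # three arbitrary trial vectors

exact = np.linalg.eigvalsh(H)            # exact eigenvalues, ascending
ritz = rayleigh_ritz(H, trial)           # Rayleigh-Ritz values, ascending

for k, (upper, lam) in enumerate(zip(ritz, exact), start=1):
    print(f"lambda_{k} = {lam:.6f} <= Ritz value {upper:.6f}")
```

Each Ritz value bounds the eigenvalue of the same index from above, whatever trial vectors are chosen; nothing in the computation says how far above, which is why a complementary lower bound is needed.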

The problem of finding lower bounds to eigenvalues is intrinsically much 
more difficult. The first major breakthrough in this area was made by Alexan- 
der Weinstein (1935, 1937) who introduced intermediate problems to deter- 
mine lower bounds to the buckling load and frequencies of vibration of a 
clamped square plate. In solving these problems Weinstein used classical 
techniques involving natural boundary conditions and Lagrange multipliers. 
Soon the problems were reformulated in the language of Hilbert space. 

Without going into detail, the basic idea of intermediate problems is as 
follows. Given an eigenvalue problem 

$$Hu = \lambda u$$

we first find another eigenvalue problem, called the base problem,

$$Au = \lambda u$$

whose eigenvalues are all lower than those of H. We then build a sequence of 
intermediate problems depending on a finite number of functions which link the 
base problem to the given problem and whose eigenvalues are intermediate 
between those of H and those of A. Finally, we must solve for the eigenvalues
of the intermediate problems, thereby obtaining lower bounds to the eigen-
values of H.

In the problems solved by Weinstein the given problem turned out to be of 
the form 

$$Hu = Au - PAu = \lambda u, \qquad Pu = 0$$

where $P$ is the orthogonal projection operator onto a subspace $\mathfrak{P}$ of $\mathfrak{H}$. The
base problem here is $Au = \lambda u$. If we select a finite number of functions
$p_1, p_2, \ldots, p_n$ from $\mathfrak{P}$ and let $P_n$ be the orthogonal projection operator onto the
subspace spanned by the functions $p_i$, $i = 1, 2, \ldots, n$, we can formulate the $n$th
intermediate problem

$$Au - P_n Au = \lambda u, \qquad P_n u = 0$$

which has eigenvalues intermediate between those of $A$ and of $H$. This is called
an intermediate problem of the first type. The eigenvalues of the intermediate
problems are obtained from the Weinstein determinant

$$W(\lambda) = \det\{(R_\lambda p_i, p_k)\}, \qquad i, k = 1, 2, \ldots, n$$

where $R_\lambda$ is the resolvent operator of $A$, i.e. $R_\lambda = (A - \lambda I)^{-1}$.
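
The role of the Weinstein determinant can be checked in a matrix model. The sketch below is illustrative only: the diagonal base matrix `A`, the random constraint vectors and the helper name `weinstein_determinant` are assumptions made for the example. It relies on the standard determinant identity $W(\lambda)\det(A - \lambda I) = \det(Q^T(A - \lambda I)Q)$ for orthonormal $p_i$, where the columns of $Q$ span the orthogonal complement of the $p_i$.

```python
import numpy as np

def weinstein_determinant(A, P, lam):
    """W(lam) = det{(R_lam p_i, p_k)} with R_lam = (A - lam I)^(-1)."""
    resolvent_P = np.linalg.solve(A - lam * np.eye(A.shape[0]), P)
    return np.linalg.det(P.T @ resolvent_P)

# Hypothetical base operator with known eigenvalues 1, 2, ..., 6.
A = np.diag(np.arange(1.0, 7.0))
rng = np.random.default_rng(0)
n = 2

# Orthonormal basis of the whole space: the first n columns span the
# constraint subspace, the remaining columns span its orthogonal complement.
full_basis, _ = np.linalg.qr(rng.standard_normal((6, n)), mode="complete")
P, Q = full_basis[:, :n], full_basis[:, n:]

# Intermediate problem A u - P_n A u = lam u with P_n u = 0:
# its eigenvalues are those of A compressed to the complement.
base = np.linalg.eigvalsh(A)
intermediate = np.linalg.eigvalsh(Q.T @ A @ Q)
W_values = np.array([weinstein_determinant(A, P, mu) for mu in intermediate])

print("base eigenvalues        :", base)
print("intermediate eigenvalues:", intermediate)
print("W at those eigenvalues  :", W_values)
```

In this model the intermediate eigenvalues are zeros of $W(\lambda)$ and lie at or above the base eigenvalues of the same index. The example uses generic vectors, so no persistent eigenvalues occur.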

If no special assumptions are made on the choice of functions $p_i$, it is possible
that an eigenvalue of the base problem is also an eigenvalue of the intermediate
problem. We call such eigenvalues persistent. Since the determinant $W(\lambda)$ may
be singular at a persistent eigenvalue, in numerical applications a so-called 'big'
Weinstein determinant is used which avoids possible singularities. An analog-
ous situation occurs in problems of the second type discussed later; see
Weinstein and Stenger (1972) for details.

Intermediate problems of the first type have been applied numerically to 
problems of classical mechanics and have also had theoretical applications, 
some of which are related to quantum theory, see for instance Stenger (1968). 
A complete discussion of the numerical and theoretical applications of the first 
type of intermediate problems is given in the book of Weinstein and Stenger
(1972).

A second type of intermediate problem was introduced by Aronszajn
(1951). The basic pattern here is the same as in intermediate problems of the
first type, although the form of the problems is somewhat different. In particu-
lar, the given problem admits the decomposition

$$Hu = Au + Bu = \lambda u$$

where $A$ is of type-$\mathcal{S}$ and $B$ is positive. The base problem is $Au = \lambda u$ and the
$n$th intermediate problem is given by

$$Au + \sum_{i=1}^{n} \sum_{j=1}^{n} (u, Bp_i)\,\beta_{ij}\,Bp_j = \lambda u$$

where $\{\beta_{ij}\}$ is the inverse matrix of $\{(Bp_i, p_j)\}$. For a suitable choice of $\alpha_i$ and $q_i$
the intermediate problem can be written in the more general, yet simpler, form

$$Au + \sum_{j=1}^{n} \alpha_j (u, q_j)\,q_j = \lambda u$$

The eigenvalues of the intermediate problems are obtained from the Weinstein
determinant

$$V(\lambda) = \det\{\delta_{ik} + \alpha_i (R_\lambda q_i, q_k)\}, \qquad i, k = 1, 2, \ldots, n$$

This determinant has also been called the modified Weinstein determinant and
the Weinstein-Aronszajn determinant.
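
A finite-dimensional sketch may help fix ideas for problems of the second type. Everything concrete below is assumed for illustration: the diagonal `A`, the random positive `B`, the vectors `P`, and the helper names. The code builds the intermediate operators, checks that their eigenvalues climb from those of $A$ towards those of $H = A + B$, and verifies the standard identity $\det(A_n - \lambda I) = \det(A - \lambda I)\,V(\lambda)$, with $\alpha_i$ and $q_i$ obtained by diagonalizing $\{\beta_{ij}\}$.

```python
import numpy as np

def intermediate_operator(A, B, P):
    """A + sum_ij (u, B p_i) beta_ij B p_j, beta being the inverse of {(B p_i, p_j)}."""
    BP = B @ P
    gram = P.T @ BP
    return A + BP @ np.linalg.solve(gram, BP.T)

def diagonal_form(B, P):
    """Return alpha_j and q_j (columns) giving the form sum_j alpha_j (u, q_j) q_j."""
    BP = B @ P
    beta = np.linalg.inv(P.T @ BP)
    alpha, V = np.linalg.eigh((beta + beta.T) / 2.0)   # symmetrize against round-off
    return alpha, BP @ V

def wa_determinant(A, alpha, Q, lam):
    """V(lam) = det{delta_ik + alpha_i (R_lam q_i, q_k)}."""
    RQ = np.linalg.solve(A - lam * np.eye(A.shape[0]), Q)
    return np.linalg.det(np.eye(len(alpha)) + np.diag(alpha) @ (Q.T @ RQ))

rng = np.random.default_rng(0)
N = 8
A = np.diag(np.arange(1.0, N + 1))            # hypothetical base operator
M = rng.standard_normal((N, N))
B = M @ M.T / N + 0.1 * np.eye(N)             # hypothetical positive operator
P = rng.standard_normal((N, 4))               # arbitrary vectors p_1, ..., p_4

base = np.linalg.eigvalsh(A)
given = np.linalg.eigvalsh(A + B)
chain = [np.linalg.eigvalsh(intermediate_operator(A, B, P[:, :n])) for n in range(1, 5)]

alpha, Q = diagonal_form(B, P)
lam = 0.5                                     # any point outside the spectrum of A
lhs = np.linalg.det(intermediate_operator(A, B, P) - lam * np.eye(N))
rhs = np.linalg.det(A - lam * np.eye(N)) * wa_determinant(A, alpha, Q, lam)

print("lowest eigenvalue, base -> intermediate -> given:")
print(base[0], [c[0] for c in chain], given[0])
```

The zeros of $V(\lambda)$ away from the spectrum of $A$ are therefore the eigenvalues of the intermediate problem. In the infinite-dimensional setting the difficulty lies in evaluating the resolvent terms, which motivates the special choices described later in the chapter.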

Up to now all numerical applications of intermediate problems to quantum 
theory have involved problems of the second-type or variants of problems of 
the second-type, as is illustrated in subsequent sections. 

It should be mentioned here that while intermediate problems are not a part 
of perturbation theory, the solution of problems of the second-type has led to a 
number of contributions to perturbation theory, see for instance Kuroda 
(1961), Kato (1966), Stenger (1969), Weinstein and Stenger (1972) and 
Weinstein (1974). Several attempts have also been made to 'reduce' inter- 
mediate problems of the first-type to those of the second-type, e.g. Kuroda 
(1961), Fichera (1965) and Kato (1966). While such a reduction can be made in 
certain cases under severe limitations, the result in every such case actually 
leads to a more complicated problem than the original, see Stenger (1970a, b) 
and Weinstein and Stenger (1972). 


The first application of intermediate problems to quantum theory was given by 
Bazley (1959, 1960, 1961) who was then joined by Fox. Their collaboration, as 
well as individual research, produced many significant numerical and theoreti- 
cal results, e.g. Bazley and Fox (1961a, b; 1962a, b,c; 1963a, b; 1964; 
1966a, b, c), Fox (1972). A more complete bibliography, a survey of their work 
and tables of numerical values are given in Weinstein and Stenger (1972). 

Bazley first considered the problem of estimating the eigenvalues of the 
Hamiltonian operator for the helium atom. We now give an overview of the 
application to helium, omitting details which may be found in the original 
papers and book cited above. 

If we neglect nuclear motion, relativistic effects and the influence of spin and
denote by $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ the coordinates of the two electrons, the
Schrödinger equation for helium is

$$Hu = -\tfrac{1}{2}\Delta_1 u - \tfrac{1}{2}\Delta_2 u - (2/r_1)u - (2/r_2)u + (1/r_{12})u = \lambda u$$

where $\Delta_i$ is the Laplacian in the coordinates $(x_i, y_i, z_i)$,

$$r_i = (x_i^2 + y_i^2 + z_i^2)^{1/2}, \qquad i = 1, 2$$

$$r_{12} = [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{1/2}$$


While the domain of definition of $H$ from the point of view of the physicist
was historically only vaguely defined, Kato (1951a, b) considered the operator
$H$ on the Hilbert space of square-integrable functions (i.e., $\mathcal{L}^2$ space) over
six-dimensional coordinate space. He showed that $H$ admits there a unique
self-adjoint extension, that is, that $H$ is essentially self-adjoint. In other words
he proved that the closure of $H$ is self-adjoint, see also Kato (1966). Of course,
the closure of $H$ is no longer a differential operator in the usual sense, but it
reduces to the differential operator for sufficiently regular functions. If we
wanted to be notationally strict, we should use different symbols for the formal
differential operator and the self-adjoint extension in Hilbert space. However,
for the sake of this exposition we avoid encumbering the notation and use the
same symbol to denote the formal operator and its corresponding Hilbert space
extension.

It is by no means a foregone conclusion that $H$ is of type-$\mathcal{S}$. It is therefore
significant that Kato (1951a, b) showed that the spectrum of H begins with 
isolated eigenvalues, each of finite multiplicity. 

Following the general pattern of intermediate problems, we first find a 
suitable base problem. In the present case the base problem is 

$$Au = -\tfrac{1}{2}\Delta_1 u - \tfrac{1}{2}\Delta_2 u - (2/r_1)u - (2/r_2)u = \lambda u$$

This operator $A$ is the Hamiltonian of a system composed of two independent
hydrogen-like atoms and admits a unique self-adjoint extension having the
same domain as $H$. The eigenvalues and eigenfunctions of $A$ are well known,
see Kemble (1958). The eigenvalues are given by

$$-2[(1/n_1^2) + (1/n_2^2)], \qquad n_1, n_2 = 1, 2, \ldots$$

with multiplicities $n_1^2 n_2^2$, and the corresponding eigenfunctions are products of
hydrogenic wave functions. Since the continuous spectrum of $A$ consists of the
interval $[-2, \infty)$, the lower part of the spectrum begins with isolated eigen-
values, that is to say, $A$ is also of type-$\mathcal{S}$.
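
As a small numerical illustration (a sketch; the function name and the cut-off of five shells are arbitrary choices made here), one can tabulate the eigenvalue formula and pick out the levels lying below the start of the continuous spectrum at $-2$:

```python
def base_eigenvalue(n1, n2):
    """Eigenvalue -2[(1/n1^2) + (1/n2^2)] of the helium base problem (atomic units)."""
    return -2.0 * (1.0 / n1**2 + 1.0 / n2**2)

CONTINUUM_START = -2.0   # the continuous spectrum of A is [-2, infinity)

# Collect the levels that lie strictly below the continuous spectrum.
isolated_levels = {}
for n1 in range(1, 6):
    for n2 in range(n1, 6):
        energy = base_eigenvalue(n1, n2)
        if energy < CONTINUUM_START:
            isolated_levels[(n1, n2)] = energy

for (n1, n2), energy in sorted(isolated_levels.items(), key=lambda item: item[1]):
    print(f"n1 = {n1}, n2 = {n2}: {energy:.4f}")
```

Only pairs with one quantum number equal to 1 fall below $-2$, so the isolated eigenvalues form the single sequence $-2[1 + (1/k^2)]$, accumulating at $-2$ from below.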

If we now decompose the given operator $H$ as $H = A + B$, where $B$ is the
non-negative operator given by

$$Bu = (1/r_{12})u$$

we are in a position to form intermediate problems of the second-type. 

At this point one could attempt to solve intermediate problems with arbi-
trary functions $p_i$. Such an approach, however, would be fruitless since the
resulting intermediate problems would not lend themselves to numerical 
solutions. In order to overcome this difficulty, Bazley introduced a special 
choice of functions which led to an algebraic problem, readily solvable by using 
computers. The special choice of Bazley in certain respects parallels the 
distinguished choice used earlier by Weinstein in problems of the first-type. 


In order to form the special choice, we let $u_i^{(0)}$ denote the (known) eigenfunc-
tions of the base problem $Au = \lambda u$. We choose vectors $p_i$ such that

$$Bp_i = u_i^{(0)}, \qquad i = 1, 2, \ldots, n$$

In this way, the $n$th intermediate problem becomes

$$Au + \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij}\,(u, u_i^{(0)})\,u_j^{(0)} = \lambda u$$

where $\{\beta_{ij}\}$ is the matrix inverse to $\{(Bp_i, p_j)\}$.

In the case of helium the spectrum of $A$ begins with isolated eigenvalues

$$\lambda_k^{(0)} = -2[1 + (1/k^2)], \qquad k = 1, 2, \ldots$$

having eigenfunctions

$$u_1^{(0)} = (1/4\pi)\,R_{10}(r_1)R_{10}(r_2)$$

$$u_k^{(0)} = (1/\sqrt{2}\,4\pi)[R_{10}(r_1)R_{k0}(r_2) + R_{10}(r_2)R_{k0}(r_1)], \qquad k = 2, 3, \ldots$$

where the elements $R_{k0}$ are the normalized hydrogen radial wave functions.

By observing that $B$ is easily invertible and yields the special choice
$p_i = r_{12}u_i^{(0)}$ $(i = 1, 2, 3)$, Bazley solved the third intermediate problem and
obtained the lower bounds $-3.063 \le E(1^1S)$ and $-2.1655 \le E(2^1S)$ for the
S-states of parahelium.

Bazley obtained an improved lower bound for $E(1^1S)$ by using the lower
bound for $E(2^1S)$ in Temple's formula. This lower bound combined with the
Rayleigh-Ritz upper bound computed by Kinoshita (1957) gives the quite
accurate estimate $-2.9037474 \le E(1^1S) \le -2.9037237$.
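
Temple's formula itself is not written out in this chapter. The sketch below assumes its standard form: for a normalized trial vector with mean $\langle H\rangle$ and variance $\langle H^2\rangle - \langle H\rangle^2$, and any number $\nu$ with $\langle H\rangle < \nu \le \lambda_2$, one has $\lambda_1 \ge \langle H\rangle - (\langle H^2\rangle - \langle H\rangle^2)/(\nu - \langle H\rangle)$. The $3 \times 3$ matrix, the trial vector and the function name are hypothetical.

```python
import numpy as np

def temple_lower_bound(H, trial, nu):
    """Temple-type lower bound to the lowest eigenvalue of symmetric H.

    nu must be a lower bound to the second eigenvalue and must exceed <H>.
    """
    v = trial / np.linalg.norm(trial)
    Hv = H @ v
    mean = v @ Hv                      # Rayleigh quotient <H>, an upper bound
    variance = Hv @ Hv - mean**2       # <H^2> - <H>^2
    if not mean < nu:
        raise ValueError("Temple's formula requires <H> < nu")
    return mean - variance / (nu - mean), mean

# Hypothetical model: eigenvalues 1, 2, 3 and a trial vector close to the
# lowest eigenvector.
H = np.diag([1.0, 2.0, 3.0])
trial = np.array([1.0, 0.1, 0.05])
nu = 1.9                               # a (non-sharp) lower bound to lambda_2 = 2

lower, upper = temple_lower_bound(H, trial, nu)
print(f"{lower:.5f} <= lambda_1 <= {upper:.5f}")
```

The quality of the bound depends on both the trial vector and on how close $\nu$ is to the second eigenvalue, which is why a good lower bound for $E(2^1S)$ from intermediate problems is valuable here.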

While we have concentrated our attention here on the first and second 
eigenvalues, it should be noted that intermediate problems may be used to 
obtain lower bounds to an arbitrary number of eigenvalues at the lower part of 
the spectrum. 


In many numerical applications, if we take the most obvious or natural base 
problem and attempt to solve intermediate problems relative to that base 
problem, we are confronted with transcendental equations which cannot be 
readily solved by numerical means. However, these computational difficulties 
may be circumvented by the following truncation of the base operator. 

The idea of truncating the base operator was introduced by Weinberger 
(1959) in problems of the first-type and was later developed by Bazley and Fox 
(1961b) in problems of the second-type where it was successfully applied to 
problems of quantum theory. 



We begin by considering the spectral representation of the original base 
operator A, namely 

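$$Au = \sum_{i} \lambda_i^{(0)}\,(u, u_i^{(0)})\,u_i^{(0)} + \int_{\lambda_\infty - 0}^{\infty} \lambda\, dE_\lambda u$$

where the sigma may denote a finite sum or an infinite series. For a fixed
positive integer $N$, we define the truncation operator of order $N$ by

$$T_N u = \sum_{i=1}^{N} \lambda_i^{(0)}\,(u, u_i^{(0)})\,u_i^{(0)} + \lambda_{N+1}^{(0)} \int_{\lambda_{N+1}^{(0)} - 0}^{\infty} dE_\lambda u$$

We assume without loss of generality that $\lambda_N^{(0)} < \lambda_{N+1}^{(0)}$. The truncation operator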
is generally simpler than the original base operator since it consists of a 
negative-semidefinite operator of finite rank plus a multiple of the identity. 
Such an operator has only a finite number of distinct eigenvalues and no 
continuous spectrum. 

The advantage gained by using the truncation operator is that the resolvent 
of T N has the form 

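$$R_\lambda(T_N)p = \sum_{i=1}^{N} \frac{(p, u_i^{(0)})}{\lambda_i^{(0)} - \lambda}\,u_i^{(0)} + \frac{1}{\lambda_{N+1}^{(0)} - \lambda}\left[p - \sum_{i=1}^{N}(p, u_i^{(0)})\,u_i^{(0)}\right]$$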

Therefore, in the intermediate problems the resulting Weinstein determinant is 
a rational function instead of a (generally) transcendental function, thus 
reducing the difficulty of numerically determining the eigenvalues of the 
intermediate problems. 
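
The truncation and its resolvent can be tried out on a matrix. This is a minimal sketch under stated assumptions: a random symmetric matrix stands in for the base operator, and the function names are invented for the example. The truncation is written as $\lambda_{N+1}^{(0)} I$ plus a negative-semidefinite operator of rank $N$.

```python
import numpy as np

def truncation_operator(eigvals, eigvecs, N):
    """T_N: keep the first N eigenpairs, replace the rest of the spectrum by lambda_{N+1}."""
    head = eigvecs[:, :N]
    lam_next = eigvals[N]
    shift = eigvals[:N] - lam_next          # non-positive numbers
    return lam_next * np.eye(len(eigvals)) + (head * shift) @ head.T

def truncated_resolvent(eigvals, eigvecs, N, lam, p):
    """Closed form of (T_N - lam I)^(-1) p, a rational function of lam."""
    head = eigvecs[:, :N]
    coeff = head.T @ p
    return head @ (coeff / (eigvals[:N] - lam)) + (p - head @ coeff) / (eigvals[N] - lam)

rng = np.random.default_rng(3)
S = rng.standard_normal((7, 7))
A = (S + S.T) / 2.0                          # hypothetical base operator
eigvals, eigvecs = np.linalg.eigh(A)         # ascending eigenvalues

N = 3
T = truncation_operator(eigvals, eigvecs, N)
p = rng.standard_normal(7)
lam = eigvals[0] - 1.0                       # a point below the spectrum

direct = np.linalg.solve(T - lam * np.eye(7), p)
closed_form = truncated_resolvent(eigvals, eigvecs, N, lam, p)
gap = np.linalg.eigvalsh(A - T)              # A - T_N should be non-negative

print("max deviation between the two resolvents:", np.max(np.abs(direct - closed_form)))
print("smallest eigenvalue of A - T_N:", gap.min())
```

Since $T_N$ does not exceed $A$, it can serve as a base operator in its own right, and lower bounds obtained from it remain lower bounds for the given problem.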

Bazley and Fox (1961a) applied the method of truncation to the helium 
atom. Even with a truncation of order two and only the second intermediate 
problem they obtained an improvement of the lower bounds Bazley (1961) 
obtained by the special choice. 

In this case the truncated base problem is given by 

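$$T_2u = \lambda_1^{(0)}(u, u_1^{(0)})u_1^{(0)} + \lambda_2^{(0)}(u, u_2^{(0)})u_2^{(0)} + \lambda_3^{(0)}[u - (u, u_1^{(0)})u_1^{(0)} - (u, u_2^{(0)})u_2^{(0)}] = \lambda u$$

where $\lambda_1^{(0)}$, $\lambda_2^{(0)}$, $\lambda_3^{(0)}$, $u_1^{(0)}$ and $u_2^{(0)}$ are as given in Section 3. Letting $Bu =
(1/r_{12})u$ as before and choosing functions

$$p_1 = [(1.5)^3/\pi]\,e^{-1.5(r_1 + r_2)}$$

$$p_2 = [(5\sqrt{5})/\pi]\,r_{12}\exp[-(5)^{1/2}(r_1 + r_2)]$$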

Bazley and Fox were able to form an intermediate problem whose eigenvalues 
could be computed by hand. 

Another problem to which Bazley and Fox applied truncation was the radial
Schrödinger equation. In order to solve this problem the given eigenvalue
problem

$$-d^2\psi/dx^2 - z[(1 - e^{-\alpha x})/x]\psi = E\psi$$

is transformed into another eigenvalue problem in the following way.

While ordinarily one would consider E as the eigenvalue, here we fix the 
energy E and take the charge z as the eigenvalue. The numerical results may 
then be inverted to give the energy eigenvalues E. The reason for taking z as 
the eigenvalue is that the resulting base problem has a pure point spectrum. 

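We put $E = -k^2$ and introduce the transformations

$$t = 2kx, \qquad u(t) = t^{-1}\psi(t)$$

to obtain

$$-\frac{1}{t}\frac{d}{dt}\left(t^2\frac{du}{dt}\right) + \frac{t}{4}u = \lambda\,(1 - e^{-\alpha t/2k})\,u$$

where $\lambda = z/2k$. We now have an eigenvalue problem of the form

$$Au = \lambda(u - Bu)$$

where

$$Au = -\frac{1}{t}\frac{d}{dt}\left(t^2\frac{du}{dt}\right) + \frac{t}{4}u$$

$$Bu = e^{-\alpha t/2k}u$$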
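
The change of variables can be checked directly with the chain rule. The derivation below is a sketch which assumes the substitutions $E = -k^2$, $t = 2kx$, $\psi = t\,u$ and $\lambda = z/2k$; primes denote derivatives with respect to $t$.

```latex
\begin{align*}
t = 2kx \;&\Rightarrow\; \frac{d^2\psi}{dx^2} = 4k^2\,\psi'', \qquad \frac{1}{x} = \frac{2k}{t},\\[2pt]
-4k^2\,\psi'' - 2kz\,\frac{1 - e^{-\alpha t/2k}}{t}\,\psi = -k^2\psi
\;&\Rightarrow\; -\psi'' + \tfrac{1}{4}\psi = \lambda\,\frac{1 - e^{-\alpha t/2k}}{t}\,\psi,\\[2pt]
\psi = t\,u \;&\Rightarrow\; \psi'' = t\,u'' + 2u' = \frac{1}{t}\frac{d}{dt}\Bigl(t^2\frac{du}{dt}\Bigr),\\[2pt]
&\Rightarrow\; -\frac{1}{t}\frac{d}{dt}\Bigl(t^2\frac{du}{dt}\Bigr) + \frac{t}{4}\,u = \lambda\,\bigl(1 - e^{-\alpha t/2k}\bigr)\,u .
\end{align*}
```

Dropping the exponential term and writing $u = e^{-t/2}v$ turns the left-hand operator into the associated Laguerre equation $t\,v'' + (2 - t)\,v' + (\lambda - 1)\,v = 0$, whose polynomial solutions occur for $\lambda = 1, 2, \ldots$; this agrees with the integer eigenvalues of the base problem.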

We note that the base operator $A$ has known eigenvalues,

$$\lambda_i^{(0)} = i, \qquad i = 1, 2, \ldots$$

and normalized eigenfunctions

$$u_i^{(0)} = (1/i!\,i^{1/2})\,L_i^1(t)\,e^{-t/2}, \qquad i = 1, 2, \ldots$$

where $L_i^1$ is the first derivative of the $i$th Laguerre polynomial. The given
problem has a pure point spectrum $\lambda_1 < \lambda_2 < \ldots$ diverging to infinity and

$$\lambda_i^{(0)} \le \lambda_i, \qquad i = 1, 2, \ldots$$

Here the intermediate problems are of the form

$$T_N u = \lambda(I - BP^n)u$$

where $T_N$ is the truncation operator previously defined and $P^n$ denotes a
projection on functions $p_1, p_2, \ldots, p_n$, orthogonal with respect to the inner
product $[u, v] = (u, Bv)$. In solving this problem Bazley and Fox put $p_i = u_i^{(0)}$
$(i = 1, 2, \ldots, n)$ and obtained the eigenvalues as solutions of an algebraic
system.


For this example, the value of $\alpha$ was fixed and the substitution $k = \alpha/2$ was
made so that $E = -\alpha^2/4$. Upper bounds obtained by solving a fourth-order
Rayleigh-Ritz problem based on the trial functions $u_1^{(0)}$, $u_2^{(0)}$, $u_3^{(0)}$ and $u_4^{(0)}$
together with the lower bounds obtained from the intermediate problems
provided the estimates:

$$1.2587\alpha \le z_1 \le 1.2590\alpha$$

$$2.3944\alpha \le z_2 \le 2.4164\alpha$$

$$3.4207\alpha \le z_3 \le 3.5576\alpha$$


Another variant in intermediate problems of the second type, namely the
generalized special choice, was used by Bazley and Fox (1961a) in estimating
the eigenvalues of an anharmonic oscillator.
Here the differential equation is given by

$$-u'' + x^2u + \varepsilon x^4u = \lambda u, \qquad -\infty < x < \infty$$

where $\varepsilon > 0$. Restricting our discussion to the even symmetry class, the base
problem

$$Au = -u'' + x^2u = \lambda u$$

has well-known eigenvalues

$$\lambda_i^{(0)} = 4i - 3, \qquad i = 1, 2, \ldots$$

The corresponding eigenfunctions are the linear oscillator eigenfunctions

$$u_i^{(0)} = C_i \exp(-x^2/2)\,H_{2i-2}(x), \qquad i = 1, 2, \ldots$$

Here $C_i = 2^{1-i}[(2i-2)!]^{-1/2}\pi^{-1/4}$ and $H_i$ is the $i$th Hermite polynomial.
Letting $B$ be the operator defined by

$$Bu = \varepsilon x^4u$$

we once again have the given problem in the form

$$Au + Bu = \lambda u$$

We recall from Section 3 that a special choice of functions $p_i$ is given by
$Bp_i = u_i^{(0)}$. In this problem, however, Bazley and Fox introduced a generalized
special choice given by


i = l,2, 


It turns out that by putting $p_i = u_i^{(0)}$ and using a recurrence relation for Hermite
polynomials the symmetric matrix $\{\beta_{ij}\}$ is readily obtained. The eigen-
values of the intermediate problems are then computable as the roots of a
linear system.

For various values of $\varepsilon$ the intermediate problems yielded lower bounds to
the first five eigenvalues, which complemented by the Rayleigh-Ritz upper
bounds (also given by Bazley and Fox), demonstrated the accuracy of the
method. In fact, when compared graphically, the results for the first eigenvalue
computed by intermediate problems and computed by perturbation theory
show the overwhelming superiority of intermediate problems over perturba-
tion theory. The Rayleigh-Ritz upper bounds and intermediate lower bounds
were actually indistinguishable on the graph used, while even for quite small
values of $\varepsilon$ the perturbation theory values were not even close.
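
The upper-bound half of this comparison is easy to reproduce. The sketch below is not the Bazley-Fox computation: it forms the matrix of $-u'' + x^2u + \varepsilon x^4u$ on the first even oscillator eigenfunctions (the basis size of 30 is an arbitrary choice), takes its lowest eigenvalue as a Rayleigh-Ritz upper bound, and sets it beside the second-order perturbation value $1 + \tfrac{3}{4}\varepsilon - \tfrac{21}{16}\varepsilon^2$, a standard textbook expansion added here for comparison.

```python
import numpy as np

def even_class_matrix(eps, size):
    """Matrix of -u'' + x^2 u + eps x^4 u on the first `size` even oscillator functions."""
    dim = 2 * size + 4                     # padding so that the x^4 block used below is exact
    n = np.arange(dim - 1)
    x = np.zeros((dim, dim))
    x[n, n + 1] = np.sqrt((n + 1) / 2.0)   # <n|x|n+1> in the oscillator basis
    x = x + x.T
    h = np.diag(2.0 * np.arange(dim) + 1.0) + eps * np.linalg.matrix_power(x, 4)
    even = np.arange(0, 2 * size, 2)       # indices 0, 2, 4, ... (even symmetry class)
    return h[np.ix_(even, even)]

def ritz_ground_state(eps, size=30):
    """Rayleigh-Ritz upper bound to the lowest eigenvalue."""
    return np.linalg.eigvalsh(even_class_matrix(eps, size))[0]

def perturbation_second_order(eps):
    """Perturbation series for the lowest eigenvalue, truncated after eps^2."""
    return 1.0 + 0.75 * eps - (21.0 / 16.0) * eps**2

for eps in (0.1, 0.5, 1.0):
    print(f"eps = {eps}: Ritz upper bound {ritz_ground_state(eps):.6f}, "
          f"perturbation theory {perturbation_second_order(eps):.6f}")
```

With $\varepsilon = 0$ the diagonal of the matrix reproduces the base eigenvalues $4i - 3$. Because $x^4$ couples each oscillator function to only a few neighbours, $Bu_i^{(0)}$ is a finite combination of base eigenfunctions, which is the property that makes the choice $p_i = u_i^{(0)}$ computable.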


One of the more significant advances in the applicability of intermediate 
problems to quantum mechanics was recently given by Fox (1972) who 
introduced a method of constructing intermediate operators (called compari- 
son operators in Fox's terminology) which make it possible to compute lower 
bounds for the eigenvalues of the Schrödinger operators for atoms and ions
having three or more electrons. 

The basic pattern of given problem, base problem and intermediate prob- 
lems, is also in Fox's work. However, the actual form of the intermediate 
problems is new and fundamentally different from previously used inter- 
mediate problems of the first- and second-types. 

The difficulty in dealing with the Schrödinger equation of atoms more
complicated than helium is that the lowest point in the essential spectrum of the 
base operator lies below (or very close to) the first eigenvalue of the given 
operator. In intermediate problems of the second-type the intermediate prob- 
lem is formed by adding an operator of finite rank to the base operator. Since 
the essential spectrum of the base operator is invariant under the addition of 
any compact operator, the eigenvalues of the intermediate problems would not 
provide meaningful numerical results. 

In order to overcome this difficulty Fox constructs intermediate problems by 
adding operators to the base operator which provide, even though these 
operators are non-compact, intermediate problems whose eigenvalues can be 
numerically determined. In order to accomplish this Fox used a technique of 
separation of variables. Moreover, he had to introduce and develop important 
results in the spectral theory for the separation of variables in Hilbert space, see 
Fox (1968, 1975). 

Let us now illustrate the general concepts of Fox's method. We again 
suppose that the given problem is of the form 

$$Hu = Au + Bu = \lambda u$$

Moreover, we assume that $A$ can be resolved by elementary separation of
variables and that the separation of variables used for $A$ also allows $B$ to be
written as a certain sum relative to this separation of variables. A complete
discussion of what such a decomposition involves would require us to go into
some detail regarding tensor products of Hilbert spaces, see Fox (1968, 1972,
1975).

Instead, for the purpose of the present chapter, we consider the specific
problem of the Schrödinger equation for the non-relativistic fixed-nucleus
model for the lithium atom without spin interaction. Here the given problem is

$$Hu = -\tfrac{1}{2}\Delta u - (3/r_1)u - (3/r_2)u - (3/r_3)u + (1/r_{12})u + (1/r_{13})u + (1/r_{23})u = \lambda u$$

where $\Delta$ is the nine-dimensional Laplacian, $r_i$ is the (Euclidean) distance from
the nucleus to the $i$th electron $(i = 1, 2, 3)$ and $r_{ij}$ is the distance between the $i$th
electron and the $j$th electron.
The base problem

$$Au = -\tfrac{1}{2}\Delta u - (3/r_1)u - (3/r_2)u - (3/r_3)u = \lambda u$$

factors (separation of variables) into three resolvable hydrogen-like operators,
see Kemble (1958). In order to decompose $B$ we consider the nine-
dimensional coordinate system $(x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3)$ where $(x_i, y_i, z_i)$
gives the position of the $i$th electron $(i = 1, 2, 3)$. We let $\mathfrak{H}$ denote the $\mathcal{L}^2$ space
of functions defined on $(x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3)$, let $\mathfrak{H}_{ij}$ denote the $\mathcal{L}^2$
space of functions defined on $(x_i, y_i, z_i, x_j, y_j, z_j)$, and let $\mathfrak{H}_i$ denote the $\mathcal{L}^2$ space
of functions defined on $(x_i, y_i, z_i)$.

If we now define $B_{12}$ to be multiplication by $r_{12}^{-1}$ in $\mathfrak{H}_{12}$ and let $I_3$ be the
identity operator on $\mathfrak{H}_3$, we can form a tensor product operator $\mathcal{B}_{12} = B_{12} \times I_3$
which gives multiplication by $r_{12}^{-1}$ in $\mathfrak{H}$.

The operators $\mathcal{B}_{13}$ and $\mathcal{B}_{23}$ may be formed in an analogous manner. Now
the operator $B$ can be decomposed into

$$B = \mathcal{B}_{12} + \mathcal{B}_{13} + \mathcal{B}_{23}$$

which is the decomposition necessary for Fox to form intermediate problems
and apply his method.

Instead of approximating $B$ by operators of finite rank, as in intermediate
problems of the second type, here one approximates the operators $B_{ij}$ by
operators of finite rank, say $B_{ij}^n$. While $B_{12}^n$ is an approximation to $B_{12}$ of finite
rank, the tensor product operator $\mathcal{B}_{12}^n = B_{12}^n \times I_3$ is a non-compact approxima-
tion to $\mathcal{B}_{12}$. Similar non-compact approximations to $\mathcal{B}_{13}$ and $\mathcal{B}_{23}$ may be
formed, say $\mathcal{B}_{13}^n$ and $\mathcal{B}_{23}^n$. The intermediate operator, which is then given by

$$A + \mathcal{B}_{12}^n + \mathcal{B}_{13}^n + \mathcal{B}_{23}^n$$

consists of the base operator $A$ plus non-compact operators. This means that
the essential spectrum may be displaced and lower bounds obtained.
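
Why the tensor product with an identity destroys compactness can be illustrated, loosely, with matrices. In the sketch below a rank-2 matrix stands in for a finite-rank approximation on the pair space and identity matrices of growing size stand in for the identity on the third factor; all sizes and names are hypothetical, and a finite model can only suggest the infinite-dimensional statement.

```python
import numpy as np

rng = np.random.default_rng(2)
d12 = 5                                    # dimension of the model pair space

# Rank-2 stand-in for a finite-rank approximation on the pair space.
v, w = rng.standard_normal((2, d12))
B12_n = np.outer(v, v) + np.outer(w, w)

ranks = {}
for d3 in (1, 4, 16):
    lifted = np.kron(B12_n, np.eye(d3))    # model of the tensor product with the identity
    ranks[d3] = int(np.linalg.matrix_rank(lifted))
    print(f"dimension of third factor = {d3:2d}: rank of lifted operator = {ranks[d3]}")
```

The rank of the lifted operator grows in proportion to the dimension of the third factor, and each non-zero eigenvalue is repeated that many times. With an infinite-dimensional third factor every non-zero eigenvalue has infinite multiplicity, so the lifted operator is not compact and is able to move the essential spectrum.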


It should be noted that once the decomposition of $B$ is achieved the
intermediate operators are formed in ways similar to those in problems of the
second type, that is, by using special choices, generalized special choices and
truncations.


These methods have been applied by Fox and Sigillito (1972a, b, c) to obtain 
bounds for the energy levels of radial lithium. The radial model is a simplifica- 
tion of the usual fixed-nucleus non-relativistic model based on the assumption 
that the electron distributions depend on the distances of the electrons from the 
nucleus only and not on the angular variables. This simplified model was used 
to test the methods numerically while avoiding the complexities of angular 

The Hamiltonian for radial lithium is given by

$$H = -\sum_{i=1}^{3}(\Delta_i^r/2 + 3/r_i) + \sum_{i<j} 1/\rho_{ij}$$

where $\Delta_i^r$ is the radial Laplacian

$$\Delta_i^r = \frac{1}{r_i^2}\frac{\partial}{\partial r_i}\left(r_i^2\frac{\partial}{\partial r_i}\right)$$

and $\rho_{ij} = \max[r_i, r_j]$. The operator acts on functions of the three radial distances
that are square integrable with respect to the weight function $r_1^2r_2^2r_3^2$.

Here the base operator is the sum of three resolvable one-electron hy-
drogenic Hamiltonians and the intermediate operators are formed by treating
each pairwise coupling $1/\rho_{ij}$ separately as a two-electron operator. Then, the
resulting intermediate problems can be solved numerically by diagonalizing
Hermitian matrices.
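
The starting points of the essential spectrum of the base operator, quoted after Table 1, follow from the hydrogenic levels $-Z^2/2n^2$ with $Z = 3$. The sketch below rests on an assumption that is not spelled out in the text: the essential spectrum begins where one electron is detached with zero energy while the other two occupy the lowest configuration allowed by the symmetry class, two 1s orbitals for $S = \tfrac{1}{2}$ and a 1s and a 2s orbital for $S = \tfrac{3}{2}$.

```python
Z = 3  # nuclear charge of lithium

def hydrogenic_level(n, charge=Z):
    """Energy -charge^2 / (2 n^2) of a one-electron hydrogen-like s level (atomic units)."""
    return -charge**2 / (2.0 * n**2)

# Assumed lowest two-electron configurations of the base operator once the
# third electron is detached (energy 0).
threshold_doublet = hydrogenic_level(1) + hydrogenic_level(1)   # S = 1/2: 1s, 1s
threshold_quartet = hydrogenic_level(1) + hydrogenic_level(2)   # S = 3/2: 1s, 2s

print("S = 1/2:", threshold_doublet)
print("S = 3/2:", threshold_quartet)
```

Both values lie well below the lithium energies being estimated, which is the obstacle that intermediate operators of finite rank cannot remove.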

For total spin $S$ equal to $\tfrac{1}{2}$ and $\tfrac{3}{2}$, upper bounds computed by the Rayleigh-
Ritz method together with lower bounds obtained from the eigenvalues of the
intermediate operators were given by Fox and Sigillito (1972b) and are
reproduced in Table 1.

Table 1. Bounds for energies of radial lithium

        $S = \tfrac{1}{2}$                                  $S = \tfrac{3}{2}$

$-7.620 \le \lambda_1 \le -7.488$          $-5.220 \le \lambda_1 \le -5.204$
$-7.493 \le \lambda_2 \le -7.324$          $-5.169 \le \lambda_2 \le -5.149$
$-7.457 \le \lambda_3 \le -7.275$          $-5.160 \le \lambda_3 \le -5.170$
$-7.418 \le \lambda_\infty$                       $-5.123 \le \lambda_\infty$

The bounds were subsequently improved by Fox and Sigillito (1972c). In
particular, the lower bound for the first point in the essential spectrum for $S = \tfrac{1}{2}$
was increased to -7.294. This result, together with the lower bound -5.123
given in the table, shows that the method is indeed successful in displacing the
essential spectrum, since the lowest points in the essential spectrum of the
corresponding base operators are -9 and -5.625, respectively.

Finally, we would like to mention that a similar approach to constructing
intermediate operators for the lithium atom was published by Reid (1972).
However, Reid's contribution appears to be purely formal and does not touch
on the subtleties of separation of variables in Hilbert space and the properties
of the spectra of operators acting on tensor products of Hilbert spaces. The
contributions of Fox to the spectral theory of such operators, on the other hand,
are a necessary and major part of the application of intermediate problems to
lithium and other atoms.


In the brief exposition of the present chapter we have given an overview of the
applicability of the methods of intermediate problems to quantum theory. As a
result of our attempt to emphasize what we feel to be the highlights in this
regard, we have necessarily omitted many contributions to intermediate prob-
lems for eigenvalues, which are important and interesting in their own right.
For instance, we were not able to go into detail here about the work of Löwdin
and his collaborators which includes applications of intermediate problems and
closely related methods to problems of quantum chemistry. A fairly complete
bibliography of their work may be found by referring to Löwdin (1965, 1968),
Stenger (1974) and Weinstein and Stenger (1972). On the other hand, we did
devote a little more space to recent developments by Fox in applying inter-
mediate problems to lithium, since the latter results appeared after the
publication of Weinstein and Stenger (1972) and could not be included there.

Anyone interested in more details about solving intermediate problems of
the first and second types, the applications of intermediate problems to
classical mechanics, the relationships and various inequalities for eigenvalues
and results in functional analysis connected with intermediate problems, is
referred to the books Gould (1966) and Weinstein and Stenger (1972), to the
large number of primary references cited in these books, and the more recent
papers given in the references here.


Aronszajn, N. (1951) 'Approximation methods for eigenvalues of completely continuous symmetric operators', Proc. Symp. Spectral Theory and Differential Problems, Stillwater, Oklahoma, pp.

Bazley, N. W. (1959) 'Lower bounds for eigenvalues with application to the helium atom', Proc. Nat. Acad. Sci. U.S.A., 45, 850-853.

Bazley, N. W. (1960) 'Lower bounds for eigenvalues with application to the helium atom', Phys. Rev., 120, 144-149.

Bazley, N. W. (1961) 'Lower bounds for eigenvalues', J. Math. Mech., 10, 289-308.

Bazley, N. W. and Fox, D. W. (1961a) 'Lower bounds for eigenvalues of Schrödinger's equation',

Bazley, N. W. and Fox, D. W. (1961b) 'Truncations in the method of intermediate problems for lower bounds to eigenvalues', J. Res. Nat. Bur. Std. Sec. B, 65, 105-111.

Bazley, N. W. and Fox, D. W. (1962a) 'Error bounds for eigenvectors of self-adjoint operators', J. Res. Nat. Bur. Std. Sec. B, 66, 1-4.

Bazley, N. W. and Fox, D. W. (1962b) 'A procedure for estimating eigenvalues', J. Math. Phys., 3,

Bazley, N. W. and Fox, D. W. (1962c) 'Lower bounds to eigenvalues using operator decompositions of the form B*B', Arch. Rational Mech. Anal., 10, 352-360.

Bazley, N. W. and Fox, D. W. (1963a) 'Error bounds for expectation values', Rev. Mod. Phys., 35,

Bazley, N. W. and Fox, D. W. (1963b) 'Lower bounds for energy levels of molecular systems', J. Math. Phys., 4, 1147-1153.

Bazley, N. W. and Fox, D. W. (1964) 'Improvement of bounds to eigenvalues of operators of the form T*T', J. Res. Nat. Bur. Std. Sec. B, 68, 173-183.

Bazley, N. W. and Fox, D. W. (1966a) 'Methods for lower bounds to frequencies of continuous elastic systems', Z. Angew. Math. Phys., 17, 1-37.

Bazley, N. W. and Fox, D. W. (1966b) 'Error bounds for approximations to expectation values of unbounded operators', J. Math. Phys., 7, 413-416.

Bazley, N. W. and Fox, D. W. (1966c) 'Comparison operators for lower bounds to eigenvalues', J. Reine Angew. Math., 223, 142-149.

Fichera, G. (1965) Linear Elliptic Differential Systems and Eigenvalue Problems (Lecture Notes in Mathematics), Springer, New York.

Fox, D. W. (1968) Separation of variables and spectral theory for self-adjoint operators in Hilbert space, Informal Report, Applied Mathematics Group, Applied Physics Laboratory, The Johns Hopkins University, Silver Spring, Maryland.

Fox, D. W. (1972) 'Lower bounds for eigenvalues with displacement of essential spectra', SIAM J. Math. Anal., 3, 617-624.

Fox, D. W. (1975) 'Spectral measures and separation of variables', J. Res. Nat. Bur. Std.,

Fox, D. W. and Sigillito, V. G. (1972a) 'Lower and upper bounds to energies of radial lithium', Chem. Phys. Letters, 13, 85-87.

Fox, D. W. and Sigillito, V. G. (1972b) 'Bounds for energies of radial lithium', J. Appl. Math. Phys., 23, 392-411.

Fox, D. W. and Sigillito, V. G. (1972c) 'New lower bounds for energies of radial lithium', Chem. Phys. Letters, 14, 583-585.

Gould, S. H. (1966) Variational Methods for Eigenvalue Problems: An Introduction to the Weinstein Method of Intermediate Problems, 2nd ed., University of Toronto Press.

Kato, T. (1951a) 'Fundamental properties of Hamiltonian operators of Schrödinger type', Trans. Amer. Math. Soc., 70, 195-211.

Kato, T. (1951b) 'On the existence of solutions of the helium wave equation', Trans. Amer. Math. Soc., 70, 212-218.

Kato, T. (1966) Perturbation Theory for Linear Operators, Springer, New York.

Kemble, E. C. (1958) The Fundamental Principles of Quantum Mechanics, Dover, New York.

Kinoshita, T. (1957) 'Ground state of the helium atom', Phys. Rev., 105, 1490.

Kuroda, S. T. (1961) 'On a generalization of the Weinstein-Aronszajn formula and the infinite determinant', Sci. Papers College Gen. Ed. Univ. Tokyo, 11, 1-12.

Löwdin, P. O. (1965) 'Studies in perturbation theory XI. Lower bounds to energy eigenvalues, ground state, and excited states', J. Chem. Phys., 43, S175-S185.

Löwdin, P. O. (1968) 'Studies in perturbation theory XIII. Treatment of constants of motion in resolvent method, partitioning technique, and perturbation theory', Intern. J. Quantum Chem., 2, 867-931.

Reid, C. E. (1972) 'Intermediate Hamiltonians for the lithium atom', Intern. J. Quantum Chem., 6, 793-797.

Stenger, W. (1968) 'On the variational principles for eigenvalues for a class of unbounded operators', J. Math. Mech., 17, 641-648.

Stenger, W. (1969) 'On perturbations of finite rank', J. Math. Anal. Appl., 23, 625-635.

Stenger, W. (1970a) 'Some extensions and applications of the new maximum-minimum theory of eigenvalues', J. Math. Mech., 19, 931-944.

Stenger, W. (1970b) 'On Fichera's transformation in the method of intermediate problems', Rend. Accad. Naz. Lincei, 48, 302-305.

Stenger, W. (1974) 'Intermediate problems for eigenvalues', Intern. J. Quantum Chem., 8,

Weinberger, H. F. (1959) A Theory of Lower Bounds for Eigenvalues, Tech. Note BN-183, IFDAM, University of Maryland, College Park, Maryland.

Weinstein, A. (1935) 'On a minimal problem in the theory of elasticity', J. London Math. Soc., 10,

Weinstein, A. (1937) 'Études des spectres des équations aux dérivées partielles de la théorie des plaques élastiques', Mémor. Sci. Math., 88.

Weinstein, A. (1974) 'On non-self-adjoint perturbations of finite rank', J. Math. Anal. Appl., 45,

Weinstein, A. and Stenger, W. (1972) Methods of Intermediate Problems for Eigenvalues: Theory and Ramifications, Academic Press, New York.

Position Observables of the Photon 


Physikalisches Institut der Universitat Wiirzburg, Germany 


Quantum theory was initiated by Planck's discovery of the discontinuous 
character of light emission and absorption and Einstein's subsequent 
hypothesis of light quanta. From the interference phenomena of light it was 
already apparent that these light quanta (or photons, as we call them now) 
could not be particles of the simple kind considered in classical mechanics. A 
more precise description of the 'non-classical' behaviour of particles, however, 
was first given much later by quantum mechanics. Perhaps the most impressive 
deviation from 'classical' behaviour shows up in Heisenberg's famous uncertainty
relation (Heisenberg, 1927)

$$ \Delta X_i \cdot \Delta P_i \geq \tfrac{1}{2} \tag{1} $$

for the components of position X and momentum P of, for example, an 

The photon itself, however, has not yet found its way into textbooks of 
quantum mechanics as an example for typical quantum properties of particles. 
Of course some simple interference or polarization experiments with light 
(which at sufficiently low intensities may be interpreted tentatively as
experiments with 'single photons') are sometimes discussed in introductory
textbooks. A more detailed treatment of one-photon quantum mechanics, however,
is usually reserved for advanced texts, for example, on quantum electrodynamics.

This neglect of the photon is perhaps partly due to the following cir- 
cumstance. It has been proved (Newton and Wigner, 1949; Wightman, 1962) 
that there is no self-adjoint (vector) operator X in the state space of the photon 
which, according to the rules of ordinary quantum mechanics, could be 
interpreted as a position observable. Usually this is taken to indicate that, 
simply, photons are not localizable at all. Accordingly an uncertainty relation 
like (1), which is so typical for massive quantum particles, could not even be 
formulated for photons (or other massless particles, e.g. neutrinos). 

* As usual, we set $\hbar = c = 1$ in our system of units.



Quite apart from possible theoretical objections, however, the experiment 
itself seems to reject such a radical interpretation of the mentioned results. In 
fact single photons may be localized experimentally, at least above some 
energy threshold, by suitable detectors (counters, photographic emulsions, 
etc.), which moreover are very similar to the detectors for massive particles. 
Then, obviously, one has to ask how such experiments can be described 
theoretically. It is clear from what has been said before that such a description 
can be obtained only if the usual requirements for a position observable are 
somewhat relaxed. 

A first proposal in this direction was made by Jauch and Piron (1967) and 
Amrein (1969). For reasons which will be explained later, a different approach 
is preferred here, which has already been sketched elsewhere (Kraus, 1971), 
and which is based on Ludwig's reformulation of quantum theory (Ludwig, 
1970). It will be shown that, starting from a suitable generalization of the 
notion of observables as suggested by Ludwig's theory (see also Neumann, 
1971), position observables for the photon may indeed be constructed. For 
these position observables we will then prove, among other things, the validity 
of the uncertainty relation (1). 

Before discussing the localization problem, however, we will first have to 
review the usual quantum-mechanical description of single photons. This could 
be done in a most satisfactory way by starting from the representation theory of 
the Poincaré (i.e. the inhomogeneous Lorentz) group (Wigner, 1939).
Moreover, group theory would allow a unified treatment of other elementary 
particles along with the photon. Since, however, the main purpose of the 
present paper is neither mathematical elegance nor complete generality, we 
have chosen a more elementary treatment which is particularly adapted to the 
photon. The subsequent construction and discussion of a position observable 
for the photon is also very elementary in this formalism. 

The intentions of the present paper may be sketched as follows. First of all, 
we want to present the photon as just another example — perhaps a somewhat 
surprising one — for the universal validity of the celebrated uncertainty rela- 
tion (1). Secondly, a natural and useful generalization of the concept of 
quantum-mechanical observables will be illustrated by the example of photon 
position. We feel that, for both of these purposes, a fairly low level of 
mathematical sophistication and rigour is sufficient. Thus, more advanced
mathematical techniques will be used only when (and to the extent that) they
really help to clarify the matter and improve the presentation, and
mathematical subtleties of a more technical type will often be omitted.


The simplest quantum mechanical description of a single free photon is the
following. Pure states* correspond to unit vectors in the Hilbert space $\mathcal{H}$ of
complex square-integrable vector functions $\mathbf{A}(\mathbf{k})$ of a real vector $\mathbf{k}$ which satisfy
the transversality condition

$$ \mathbf{k} \cdot \mathbf{A}(\mathbf{k}) = 0 \tag{2} $$

*Throughout this paper we will consider pure states only. The discussion of state mixtures (density
matrices) is irrelevant for our present investigation.

The scalar product in $\mathcal{H}$ is given by

$$ \langle \mathbf{A}, \mathbf{A}' \rangle = \int d^3k \, \bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}'(\mathbf{k}) \tag{3} $$

with a bar denoting complex conjugation. A complete system of commuting
observables is given by the three momentum components

$$ P_i = k_i \tag{4} $$

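(multiplication operators) and the helicity (spin component in the direction of
momentum)

$$ \sigma = \frac{\mathbf{P}}{|\mathbf{P}|} \cdot \mathbf{s} = \frac{\mathbf{k}}{\omega} \cdot \mathbf{s} \tag{5} $$

Here $\omega = |\mathbf{k}|$, and the $j$-component $s_j$ of the spin operator $\mathbf{s}$ acts as the matrix

$$ (s_j)_{kl} = -i \varepsilon_{jkl} \tag{6} $$

on vector functions $\mathbf{A}$, with $\varepsilon_{jkl}$ denoting the Levi-Civita symbol. From (5) and
(6) the action of $\sigma$ on vector functions $\mathbf{A}$ is easily calculated:

$$ (\sigma \mathbf{A})(\mathbf{k}) = i \frac{\mathbf{k}}{\omega} \times \mathbf{A}(\mathbf{k}) \tag{7} $$

With momentum $\mathbf{P}$, the energy (Hamiltonian) $H$ is also a multiplication
operator:

$$ H = |\mathbf{P}| = |\mathbf{k}| = \omega \tag{8} $$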

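The helicity operator $\sigma$ has eigenvalues $\pm 1$ in $\mathcal{H}$, with corresponding
eigenspaces $\mathcal{H}_\pm$. In order to show this explicitly, we choose for each $\mathbf{k}$ a
right-handed orthonormal set of polarization vectors

$$ \mathbf{e}_1(\mathbf{k}), \quad \mathbf{e}_2(\mathbf{k}) \quad \text{and} \quad \mathbf{e}_3(\mathbf{k}) = \frac{\mathbf{k}}{\omega} $$

By (7), then, vector functions of the form

$$ \mathbf{A}_\pm(\mathbf{k}) = a_\pm(\mathbf{k}) \big( \mathbf{e}_1(\mathbf{k}) \pm i \mathbf{e}_2(\mathbf{k}) \big) \tag{9} $$

with arbitrary coefficients $a_\pm(\mathbf{k})$ are eigenfunctions of $\sigma$ with eigenvalues $\pm 1$.
Moreover, by (2), each $\mathbf{A} \in \mathcal{H}$ may be decomposed as

$$ \mathbf{A}(\mathbf{k}) = \mathbf{A}_+(\mathbf{k}) + \mathbf{A}_-(\mathbf{k}) $$

with suitable $\mathbf{A}_\pm \in \mathcal{H}_\pm$. Photon states of the form $\mathbf{A}_+$ and $\mathbf{A}_-$ correspond to right
and left circularly polarized light, respectively.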
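The eigenvalue statement can be checked numerically at a single momentum. The following is only a minimal sketch: the sample vector `k`, the way the polarization triple is built, and all function names are illustrative assumptions, not part of the text.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors; components may be complex."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def unit(v):
    """Normalize a real 3-vector."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def helicity(k, A):
    """Action of the helicity operator at one momentum k: A -> i (k/omega) x A."""
    return [1j * c for c in cross(unit(k), A)]

def polarization_basis(k):
    """Right-handed orthonormal triple (e1, e2, e3) with e3 = k/omega."""
    e3 = unit(k)
    # Seed with a coordinate axis that is not parallel to e3 (arbitrary choice).
    seed = [1.0, 0.0, 0.0] if abs(e3[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = unit(cross(seed, e3))
    e2 = cross(e3, e1)  # guarantees e1 x e2 = e3
    return e1, e2, e3

def close(u, v, tol=1e-12):
    """Componentwise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

if __name__ == "__main__":
    k = [0.3, -1.2, 2.0]  # sample momentum, chosen arbitrarily
    e1, e2, e3 = polarization_basis(k)
    A_plus = [a + 1j * b for a, b in zip(e1, e2)]
    A_minus = [a - 1j * b for a, b in zip(e1, e2)]
    print(close(helicity(k, A_plus), A_plus))                  # eigenvalue +1
    print(close(helicity(k, A_minus), [-c for c in A_minus]))  # eigenvalue -1
    print(close(helicity(k, e3), [0, 0, 0]))                   # eigenvalue 0
```

The third check concerns the longitudinal direction $\mathbf{e}_3$, which is excluded from the photon state space by the transversality condition (2).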

For later use a natural (but 'unphysical') extension of the state space $\mathcal{H}$ will
now be constructed. This simply amounts to dropping the condition (2), while
the inner product (3) is left unchanged. The enlarged Hilbert space is called $\hat{\mathcal{H}}$.
The definitions (4), (7) and (8) of operators $\mathbf{P}$, $\sigma$ and $H$ in $\mathcal{H}$ make sense also for
vector functions $\mathbf{A} \in \hat{\mathcal{H}}$, and thus may be taken to define natural extensions $\hat{\mathbf{P}}$, $\hat{\sigma}$
and $\hat{H}$ of these operators to the space $\hat{\mathcal{H}}$. In addition to (9), $\hat{\mathcal{H}}$ also contains
vector functions of the form

$$ \mathbf{A}_0(\mathbf{k}) = a_0(\mathbf{k}) \mathbf{e}_3(\mathbf{k}) \tag{10} $$

which belong to the eigenvalue zero of $\hat{\sigma}$ and constitute the subspace $\hat{\mathcal{H}}_0$ of $\hat{\mathcal{H}}$.
Such functions are 'unphysical' since they do not describe photon states, and
consequently the assignment of 'momentum', 'energy' and 'helicity' to such
'states' by the operators $\hat{\mathbf{P}}$, $\hat{H}$ and $\hat{\sigma}$ is purely formal.

The concrete realization of the photon state space $\mathcal{H}$ given above has the
advantage of providing a one-to-one correspondence between photon states
and transverse vector functions $\mathbf{A}(\mathbf{k})$. The transformation law of such functions
under Poincaré transformations, however, looks rather complicated if written
down explicitly. It is therefore better to describe it implicitly in terms of another
realization of the space $\mathcal{H}$.

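We consider the space $\mathcal{G}$ of complex four-vector functions

$$ B^\nu(\mathbf{k}) = \{ \mathbf{B}(\mathbf{k}), B^4(\mathbf{k}) \} \tag{11} $$

with the additional requirements

$$ k_\nu B^\nu = \mathbf{k} \cdot \mathbf{B} - \omega B^4 = 0 \tag{12} $$

(Lorentz condition; $k^4 = \omega$) and

$$ \int d\mu(\mathbf{k}) \, \bar{B}_\nu(\mathbf{k}) B^\nu(\mathbf{k}) < \infty, \qquad d\mu(\mathbf{k}) = \frac{d^3k}{\omega} \tag{13} $$

(square integrability).* With the inner product

$$ \langle B, B' \rangle = \int d\mu(\mathbf{k}) \, \bar{B}_\nu(\mathbf{k}) B'^\nu(\mathbf{k}) \tag{14} $$

$\mathcal{G}$ becomes something like a Hilbert space. In virtue of (12) the space part $\mathbf{B}$ of
any $B^\nu \in \mathcal{G}$ may be decomposed according to

$$ \mathbf{B} = \omega^{1/2} \mathbf{A} + \frac{\mathbf{k}}{\omega} \left( \frac{\mathbf{k}}{\omega} \cdot \mathbf{B} \right) = \omega^{1/2} \mathbf{A} + \frac{\mathbf{k}}{\omega} B^4 \tag{15} $$

into a transverse part $\omega^{1/2} \mathbf{A}$ (i.e. $\mathbf{k} \cdot \mathbf{A} = 0$) and a longitudinal component
$(\mathbf{k}/\omega) \cdot \mathbf{B} = B^4$. Then (14) immediately yields

$$ \langle B, B' \rangle = \int d^3k \, \bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}'(\mathbf{k}) \tag{16} $$

*Notation for four-vectors $a^\nu = \{ \mathbf{a}, a^4 \}$, $\nu = 1 \ldots 4$: $a_\nu = a^\nu$ for $\nu = 1, 2, 3$; $a_4 = -a^4$;
$a_\nu b^\nu = \mathbf{a} \cdot \mathbf{b} - a^4 b^4$ (sum convention).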

(The factor $\omega^{1/2}$ has been introduced in (15) since we want to have $d^3k$, instead of
$d\mu(\mathbf{k})$, in (16).) From (16), the positivity of the inner product (14) follows.
Moreover, since only the transverse part of $\mathbf{B}$ enters (16), we see that all
four-vector functions $B^\nu$ which differ only with respect to $(\mathbf{k}/\omega) \cdot \mathbf{B} = B^4$
represent the same vector in the Hilbert space defined by the inner product
(14). In other words: Functions $B^\nu \in \mathcal{G}$ with vanishing transverse part $\omega^{1/2} \mathbf{A}$ of $\mathbf{B}$
are zero vectors with respect to the inner product (14). Such functions
constitute a subspace $\mathcal{G}_0$ of $\mathcal{G}$. Then, not $\mathcal{G}$ itself, but the space $\mathcal{G}/\mathcal{G}_0$ of
equivalence classes in $\mathcal{G}$ with respect to $\mathcal{G}_0$, is a Hilbert space. Such an
equivalence class consists of all $B^\nu$ of the form

$$ B^\nu = \left\{ \omega^{1/2} \mathbf{A} + \frac{\mathbf{k}}{\omega} B^4, \; B^4 \right\} \tag{17} $$

with a given transverse $\mathbf{A}$ and arbitrary $B^4$.
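The step from (14) to (16) can be written out in one line. This is a sketch that uses only the decomposition (15), the transversality of $\mathbf{A}$ and $\mathbf{A}'$, and the measure $d\mu(\mathbf{k}) = d^3k/\omega$; the abbreviation $\mathbf{n} = \mathbf{k}/\omega$ is introduced here for convenience and is not the author's notation.

```latex
\begin{align*}
\bar{B}_\nu B'^\nu
  &= \bar{\mathbf{B}} \cdot \mathbf{B}' - \bar{B}^4 B'^4 \\
  &= \big( \omega^{1/2} \bar{\mathbf{A}} + \mathbf{n} \bar{B}^4 \big) \cdot
     \big( \omega^{1/2} \mathbf{A}' + \mathbf{n} B'^4 \big) - \bar{B}^4 B'^4 \\
  &= \omega \, \bar{\mathbf{A}} \cdot \mathbf{A}'
     \qquad (\mathbf{n} \cdot \mathbf{A} = \mathbf{n} \cdot \mathbf{A}' = 0, \ \mathbf{n} \cdot \mathbf{n} = 1), \\
\langle B, B' \rangle
  &= \int \frac{d^3k}{\omega} \, \omega \, \bar{\mathbf{A}} \cdot \mathbf{A}'
   = \int d^3k \, \bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}'(\mathbf{k}) .
\end{align*}
```

The longitudinal component $B^4$ drops out completely, which is why functions differing only in $B^4$ fall into the same equivalence class.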

By (3) and (16), the Hilbert spaces $\mathcal{H}$ and $\mathcal{G}/\mathcal{G}_0$ may be identified in an
obvious way, as already indicated by the use of the same symbol $\mathbf{A}$ in both
cases. Thus $\mathcal{G}/\mathcal{G}_0$ is also a realization of the photon state space. (This
realization is formally analogous to the Fermi gauge in quantum electrodynamics,
whereas the former one corresponds to the Coulomb gauge.
There also exists a description of one-photon states which corresponds to the
Gupta-Bleuler gauge, but, contrary to what happens in quantum
electrodynamics, this gauge is not very useful here.)

A certain disadvantage of the new formalism lies in the fact that, due to the
arbitrariness of $B^4$ in (17), the correspondence between photon states and
four-vector functions $B^\nu(\mathbf{k})$ is no longer one-to-one. (In fact, $B^\nu(\mathbf{k})$ and

$$ B'^\nu(\mathbf{k}) = B^\nu(\mathbf{k}) + k^\nu \chi(\mathbf{k}) $$

with arbitrary $\chi(\mathbf{k})$, represent the same photon state. This, obviously, corresponds
to a certain class of gauge transformations in classical electrodynamics.)
This disadvantage is more than compensated, however, by the simple (four-vector)
transformation law of $B^\nu \in \mathcal{G}$ under Poincaré transformations. For a
Poincaré transformation consisting of a homogeneous orthochronous* Lorentz
transformation $\Lambda$ (with matrix $\Lambda^\nu{}_\mu$, $\nu, \mu = 1 \ldots 4$) and a subsequent
four-translation $a$ (with components $a^\nu$, $\nu = 1 \ldots 4$), this transformation law is

$$ B^\nu(\mathbf{k}) \to (U(a, \Lambda) B)^\nu(\mathbf{k}) = e^{-i k_\mu a^\mu} \Lambda^\nu{}_\mu B^\mu(\Lambda^{-1} \mathbf{k}) \tag{18} $$

(Here $\Lambda^{-1} \mathbf{k}$ is the space part of the four-vector resulting from $k^\nu = \{ \mathbf{k}, \omega \}$ by the
Lorentz transformation $\Lambda^{-1}$.) As easily shown, equation (18) defines a representation
of the Poincaré group on $\mathcal{G}$. From (18) and the Lorentz invariance

$$ d\mu(\mathbf{k}) = d\mu(\Lambda^{-1} \mathbf{k}) $$

of the measure $d\mu(\mathbf{k})$, the invariance under (18) of the inner product (14) in $\mathcal{G}$
is easily proved. In particular, $U(a, \Lambda)$ transforms the space $\mathcal{G}_0$ into itself.
Therefore it may also be interpreted as a transformation of the equivalence
classes (17), and is unitary in $\mathcal{G}/\mathcal{G}_0$. Since $\mathcal{G}/\mathcal{G}_0 = \mathcal{H}$, finally, $U(a, \Lambda)$ also yields
a unitary transformation law for the transverse vector functions $\mathbf{A}(\mathbf{k})$ in our
previous formalism. The explicit calculation of this transformation law is
straightforward but unnecessary since we will not need it here. [For a particular
case see equation (39).]

*The behaviour of $B^\nu$ under Lorentz transformations with time reversal looks somewhat more

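We conclude this Section with an elementary investigation of how helicity
behaves under Poincaré transformations. The helicity operator $\sigma$ may be
transferred to $\mathcal{G}$ by defining, for any

$$ B^\nu = \{ \mathbf{B}, B^4 \} = \left\{ \omega^{1/2} \mathbf{A} + \frac{\mathbf{k}}{\omega} B^4, \; B^4 \right\} \in \mathcal{G}, $$

$$ (\sigma B)^\nu = \left\{ i \frac{\mathbf{k}}{\omega} \times \omega^{1/2} \mathbf{A}, \; 0 \right\} = \left\{ i \frac{\mathbf{k}}{\omega} \times \mathbf{B}, \; 0 \right\} \tag{19} $$

It is obvious that this operator $\sigma$ in $\mathcal{G}$ induces a transformation of $\mathcal{G}/\mathcal{G}_0 = \mathcal{H}$
which coincides with the operator $\sigma$ in $\mathcal{H}$ previously defined by (7). By (19) the
equivalence class of a given $B^\nu \in \mathcal{G}$ describes a photon state of helicity $\pm 1$ if and
only if

$$ i \frac{\mathbf{k}}{\omega} \times \mathbf{B} = \pm \omega^{1/2} \mathbf{A} = \pm \left( \mathbf{B} - \frac{\mathbf{k}}{\omega} B^4 \right) $$

or, with $\omega = k^4$,

$$ i (\mathbf{k} \times \mathbf{B}) = \pm (k^4 \mathbf{B} - B^4 \mathbf{k}) $$

The last equation can be rewritten as

$$ i \varepsilon_{\kappa\lambda\mu\nu} k^\mu B^\nu = \pm (k_\kappa B_\lambda - k_\lambda B_\kappa) \tag{20} $$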


with the Levi-Civita symbol $\varepsilon_{\kappa\lambda\mu\nu}$. The invariance of (20) under pure space-time
translations follows trivially from (18). According to (18), both $k^\nu$ and $B^\nu$
behave like four-vectors under pure (orthochronous) Lorentz transformations*
whereas, as is well known, $\varepsilon_{\kappa\lambda\mu\nu}$ is a pseudotensor. Thus equation (20) is
also invariant under proper orthochronous Lorentz transformations, but under
space reflection the left-hand side changes sign. Therefore the helicity eigenspaces
$\mathcal{H}_\pm$ of $\mathcal{H}$ are invariant under proper Lorentz transformations, whereas
space reflection interchanges $\mathcal{H}_+$ and $\mathcal{H}_-$.

*I.e. for $B'^\nu = (U(0, \Lambda) B)^\nu$ we have $B'^\nu(\mathbf{k}') = \Lambda^\nu{}_\mu B^\mu(\mathbf{k})$ with $k'^\nu = \Lambda^\nu{}_\mu k^\mu$.

Many elementary particles (e.g. electron, proton, neutrinos) may be characterized
by the fact that their state space carries an irreducible representation of
the proper Poincaré group. The state of such a particle is uniquely determined
by the expectation values of all 'kinematic' observables, i.e. all infinitesimal
generators of Poincaré transformations (energy, momentum, angular momentum,
etc.). This is not so for the photon, since the helicity eigenspaces $\mathcal{H}_+$ and
$\mathcal{H}_-$ reduce the representation $U(a, \Lambda)$ (for $\Lambda$ proper). With respect to 'kinematic'
observables, therefore, a coherent superposition

$$ \alpha \mathbf{A}_+ + \beta \mathbf{A}_- \tag{21} $$

of normalized states $\mathbf{A}_\pm \in \mathcal{H}_\pm$ (with $|\alpha|^2 + |\beta|^2 = 1$) cannot be distinguished from
the incoherent mixture of these states with the weights $|\alpha|^2$ and $|\beta|^2$, respectively.
However, there are observables which permit such a distinction. For
instance, suitable states of the form (21) (with $|\alpha|^2 = |\beta|^2 = \tfrac{1}{2}$) correspond to
linear polarization, whereas the corresponding mixtures describe totally
unpolarized light. Measurements of linear polarization thus do not belong to
the 'kinematic' observables of the photon. The position observable to be
constructed in the subsequent section will be another example for observables
of this 'non-kinematic' type.


In order to be acceptable as a photon position operator* in the sense of usual
quantum mechanics, a (vector) operator $\mathbf{X}$ on the photon state space $\mathcal{H}$ has to
satisfy two requirements. First, its components $X_i$ have to be self-adjoint, and
have to commute with each other in order to be measurable together. Secondly,
the behaviour of $\mathbf{X}$ under spatial rotations and translations (i.e. Euclidean
transformations) is prescribed to be

$$ U^*(\mathbf{a}, R) \, \mathbf{X} \, U(\mathbf{a}, R) = R \mathbf{X} + \mathbf{a} \tag{22} $$

Here $U(\mathbf{a}, R)$ is the restriction of the Poincaré group representation (18) to the
Euclidean group, the elements of which consist of space rotations (or reflections)
$R$ and subsequent translations $\mathbf{a}$. Equation (22) is equivalent to the
self-evident requirement

$$ \langle \mathbf{X} \rangle_{U(\mathbf{a}, R) \mathbf{A}} = R \langle \mathbf{X} \rangle_{\mathbf{A}} + \mathbf{a} \tag{23} $$

for the expectation values

$$ \langle \mathbf{X} \rangle_{\mathbf{A}} = \langle \mathbf{A}, \mathbf{X} \mathbf{A} \rangle \tag{24} $$

of $\mathbf{X}$ in arbitrary states $\mathbf{A}$.

Another way of describing position measurements is as follows. Spatial
localizability of the photon implies the existence of observables $E(\Delta)$, corresponding
to largely arbitrary space regions $\Delta$,† which take the value one
(respectively zero) if at time $t = 0$ the photon is found (respectively not found)
inside the region $\Delta$. Measurements of such $E(\Delta)$ should be actually feasible, at
least for certain regions $\Delta$, by means of suitable counters. According to the
rules of ordinary quantum mechanics, these 'yes-no' observables have to be
represented by projection operators on $\mathcal{H}$ (which, for simplicity, are also called
$E(\Delta)$), and the probability that at time $t = 0$ a photon in state $\mathbf{A}$ 'triggers the
counter $E(\Delta)$ in the region $\Delta$' is

$$ w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, E(\Delta) \mathbf{A} \rangle \tag{25} $$

*Throughout this paper we will use the Heisenberg picture. The position observables to be
discussed thus refer to position measurements at a fixed time, $t = 0$ say. (For the conserved
quantities considered before, such specification of time was unnecessary.)
†Precisely: to all Borel sets $\Delta$.

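This physical interpretation of $E(\Delta)$ immediately implies, with $\emptyset$ denoting the
empty set and $\mathbb{R}^3$ denoting all of space,

$$ E(\emptyset) = 0, \qquad E(\mathbb{R}^3) = 1, \qquad E\Big( \bigcup_i \Delta_i \Big) = \sum_i E(\Delta_i) \quad \text{if } \Delta_i \cap \Delta_j = \emptyset \text{ for } i \neq j \tag{26} $$

(The last relation follows from the additivity of the probabilities (25) for
mutually disjoint regions $\Delta_i$.) A correspondence
$\Delta \to E(\Delta)$ of space regions $\Delta$ and projection operators $E(\Delta)$ with the properties
(26) is called a spectral measure on $\mathbb{R}^3$. Equations (26) imply that any two $E(\Delta)$
commute with each other, and that

$$ E\Big( \bigcap_i \Delta_i \Big) = \prod_i E(\Delta_i), \qquad E(\Delta_1 \cup \Delta_2) = E(\Delta_1) + E(\Delta_2) - E(\Delta_1 \cap \Delta_2), \qquad E(\Delta') = 1 - E(\Delta) \tag{27} $$

with $\Delta'$ denoting the complement of $\Delta$.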
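The requirement corresponding to (23) is now

$$ w_{\mathbf{A}}(\Delta) = w_{U(\mathbf{a}, R) \mathbf{A}}(\Delta_{\mathbf{a}, R}) \tag{28} $$

with

$$ \Delta_{\mathbf{a}, R} = R \Delta + \mathbf{a} = \{ \mathbf{x} \mid \mathbf{x} = R \mathbf{y} + \mathbf{a}, \ \mathbf{y} \in \Delta \} $$

(i.e. the region obtained from $\Delta$ by the rotation $R$ and translation $\mathbf{a}$), and is also
self-explanatory. Since the state $\mathbf{A}$ in (28) is arbitrary, this condition is
equivalent to

$$ U(\mathbf{a}, R) \, E(\Delta) \, U^*(\mathbf{a}, R) = E(\Delta_{\mathbf{a}, R}) \tag{29} $$

A spectral measure on $\mathcal{H}$ which satisfies (29) is called Euclidean covariant with
respect to the given representation $U(\mathbf{a}, R)$, or simply: covariant.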

The equivalence of the two descriptions of position measurements follows
from the fact that one may construct $\mathbf{X}$ if the $E(\Delta)$ are given, and vice versa.
Assume first the spectral measure $E(\Delta)$ to be given. From the physical
interpretation (25) of $\langle \mathbf{A}, E(\Delta) \mathbf{A} \rangle$ we conclude that, with $dE(\mathbf{x}) \equiv E(d^3x)$,
$\int x_j \langle \mathbf{A}, dE(\mathbf{x}) \mathbf{A} \rangle$ is the expectation value* of the $j$th photon coordinate in state
$\mathbf{A}$. In order to represent this as $\langle \mathbf{A}, X_j \mathbf{A} \rangle$ with the component $X_j$ of a position
operator $\mathbf{X}$, we have to take

$$ X_j = \int x_j \, dE(\mathbf{x}) \tag{30} $$

This formula, indeed, defines three self-adjoint operators $X_j$, and provides a
common spectral representation of them. The more familiar spectral representations

$$ X_j = \int \lambda \, dE_j(\lambda) \tag{31} $$

with one-dimensional spectral families $E_j(\lambda)$ follow from (30) if we define

$$ E_j(\lambda) = E(\Delta_{j, \lambda}), \qquad \Delta_{j, \lambda} = \{ \mathbf{x} \mid x_j \leq \lambda \} $$

Therefore the operators $X_j$ commute with each other in the sense that

$$ [E_j(\lambda), E_k(\mu)] = 0 \quad \text{for all } j, k, \lambda \text{ and } \mu \tag{32} $$

which is somewhat stronger than 'naive' commutativity, i.e.

$$ [X_j, X_k] = 0 \tag{33} $$

(on the dense domain where the left-hand side exists). Vice versa, any three
self-adjoint operators $X_j$ which commute in the sense of (32) possess a common
spectral representation of the form (30). The projection operators $E(\Delta)$ of the
corresponding spectral measure may be calculated explicitly as

$$ E(\Delta) = \int_\Delta dE(\mathbf{x}) = \int \chi_\Delta(\mathbf{x}) \, dE(\mathbf{x}) \tag{34} $$

with the characteristic function

$$ \chi_\Delta(\mathbf{x}) = \begin{cases} 1 & \text{for } \mathbf{x} \in \Delta \\ 0 & \text{for } \mathbf{x} \notin \Delta \end{cases} $$

of the region $\Delta$. The last expression in (34) is simply the operator function $\chi_\Delta(\mathbf{X})$
of $\mathbf{X}$, in accordance with the physical meaning of $E(\Delta)$. Finally the covariance
requirements (22) for $\mathbf{X}$ and (29) for $E(\Delta)$ may also be shown to be equivalent.

*This makes sense as a Stieltjes integral.

The result of Newton and Wigner (1949) and Wightman (1962) is, simply,
that the photon does not possess a position observable with the required
properties. [The non-existence of an operator $\mathbf{X}$ was first proved by Newton
and Wigner (1949) who, however, needed some additional assumptions for
their proof. Later on Wightman (1962) was able to prove the non-existence of a
covariant spectral measure $E(\Delta)$ without any additional requirements.] We do
not want to reproduce these proofs here, but will start instead with a naive
attempt to construct a photon position operator $\mathbf{X}$ explicitly. The failure of this
attempt will then illustrate the 'no-go' theorem of Newton, Wigner and
Wightman. Besides this, however, a suitable refinement of this construction
will lead us directly to a (generalized) position observable of the photon.

By (3) and (4), $\mathbf{A}(\mathbf{k})$ may be interpreted as the photon wave function in the
momentum representation.* In analogy to ordinary quantum mechanics, we
thus attempt to define a position operator $\hat{\mathbf{X}}$ by

$$ (\hat{X}_j \mathbf{A})(\mathbf{k}) = i \frac{\partial}{\partial k_j} \mathbf{A}(\mathbf{k}) \tag{35} $$

However, if applied to $\mathbf{A} \in \mathcal{H}$ these operators $\hat{X}_j$ destroy the transversality (2)
since, in general

$$ \mathbf{k} \cdot (\hat{X}_j \mathbf{A}) = i \frac{\partial}{\partial k_j} (\mathbf{k} \cdot \mathbf{A}) - i A_j \tag{36} $$

is not zero if $\mathbf{k} \cdot \mathbf{A} = 0$. This difficulty is absent if we read (35) as defining
operators $\hat{X}_j$ on the larger Hilbert space $\hat{\mathcal{H}}$ introduced in Section 2. Equation
(35) then defines three self-adjoint $\hat{X}_j$ on $\hat{\mathcal{H}}$, which commute with each other in
the sense of (32). In fact, $\hat{X}_j$ acts as multiplication by $x_j$ on the position space
wave functions $\mathbf{A}(\mathbf{x})$ obtained from $\mathbf{A}(\mathbf{k})$ by Fourier transformation, and the
spectral projections $\hat{E}(\Delta)$ of $\hat{\mathbf{X}}$ then correspond to multiplication by the
characteristic functions $\chi_\Delta(\mathbf{x})$ of the regions $\Delta$. The difficulty indicated by (36)
may now be circumvented as follows. Denoting by $\Phi$ the projection operator
which projects $\hat{\mathcal{H}}$ onto its physical subspace $\mathcal{H}$, and by $\hat{O}|_{\mathcal{H}}$ the restriction to $\mathcal{H}$
of an operator $\hat{O}$ acting on $\hat{\mathcal{H}}$, we define operators $X_j$ on $\mathcal{H}$ by†

$$ X_j = \Phi \hat{X}_j |_{\mathcal{H}} \tag{37} $$

This definition is chosen such that, for $\mathbf{A} \in \mathcal{H}$,

$$ \langle \mathbf{A}, X_j \mathbf{A} \rangle = \langle \mathbf{A}, \hat{X}_j \mathbf{A} \rangle \tag{38} $$

In this sense the operators $X_j$ on $\mathcal{H}$ are substitutes for the $\hat{X}_j$ which lead out of
$\mathcal{H}$. Since $\hat{X}_j$ is self-adjoint, (38) implies that $\langle \mathbf{A}, X_j \mathbf{A} \rangle$ is real; therefore the $X_j$
are at least symmetric. (We claim that they are even self-adjoint, but since this
property is unessential here we did not try to prove it.)

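*Namely, $\bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}(\mathbf{k})$ is the probability density in momentum space corresponding to a
(normalized) state $\mathbf{A} \in \mathcal{H}$. For this it is essential that the inner product (3) is defined with $d^3k$ instead of
$d\mu(\mathbf{k})$.
†It is easily proved that (37) yields operators whose domain of definition is dense in $\mathcal{H}$.

A little detour is appropriate if we want to discuss Euclidean transformations
of these operators $X_j$. The transformation law of state functions $\mathbf{A}(\mathbf{k}) \in \mathcal{H}$ under
Euclidean transformations follows from (18) as

$$ (U(\mathbf{a}, R) \mathbf{A})(\mathbf{k}) = e^{-i \mathbf{k} \cdot \mathbf{a}} R \mathbf{A}(R^{-1} \mathbf{k}) \tag{39} $$

As expected, $\mathbf{A}$ behaves as a vector under space rotations. This transformation
law may be extended quite naturally to the enlarged Hilbert space $\hat{\mathcal{H}}$ by taking

$$ (\hat{U}(\mathbf{a}, R) \mathbf{A})(\mathbf{k}) = e^{-i \mathbf{k} \cdot \mathbf{a}} R \mathbf{A}(R^{-1} \mathbf{k}) \tag{40} $$

for $\mathbf{A} \in \hat{\mathcal{H}}$ as well. [For clarity of notation we have used the symbol $\hat{U}(\mathbf{a}, R)$ for
the extension of $U(\mathbf{a}, R)$ to $\hat{\mathcal{H}}$. This extension is related to, and consistent
with, our previous extensions of momentum and helicity operators to $\hat{\mathcal{H}}$; in
fact, the latter may be expressed in terms of infinitesimal generators of
$\hat{U}(\mathbf{a}, R)$.] Since under $\hat{U}(\mathbf{a}, R)$ both $\mathbf{k}$ and $\mathbf{A}$ transform as vectors, the longitudinal
part $\mathbf{A}_0 = (\mathbf{k}/\omega)[(\mathbf{k}/\omega) \cdot \mathbf{A}]$ and the transverse part $\mathbf{A}_{\mathrm{tr}} = \mathbf{A} - \mathbf{A}_0$ of an
arbitrary $\mathbf{A} \in \hat{\mathcal{H}}$ transform separately under $\hat{U}(\mathbf{a}, R)$. This implies

$$ [\hat{U}(\mathbf{a}, R), \Phi] = 0, \qquad \hat{U}(\mathbf{a}, R)|_{\mathcal{H}} = U(\mathbf{a}, R) \tag{41} $$

i.e. $\Phi$ reduces $\hat{U}(\mathbf{a}, R)$, and the subrepresentation of $\hat{U}(\mathbf{a}, R)$ in $\mathcal{H}$ is $U(\mathbf{a}, R)$.
A straightforward calculation with (35) and (40) yields

$$ \hat{U}^*(\mathbf{a}, R) \, \hat{\mathbf{X}} \, \hat{U}(\mathbf{a}, R) = R \hat{\mathbf{X}} + \mathbf{a} \tag{42} $$

as to be expected from the vector character of $\hat{\mathbf{X}} = i \nabla_{\mathbf{k}}$. Together with (41) this
immediately leads to a transformation law of the desired form (22) for the
operator $\mathbf{X}$ defined by (37).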

However, we know from Newton and Wigner (1949) and Wightman (1962) 
that our construction of a photon position operator has to fail somewhere. This 
failure is indeed easily seen. With (35), (36) and (37) we obtain explicitly, for 

From this we find by a trivial calculation that the commutativity condition (33) 
is violated by our X. In the rest of this Section we will try to show that X, in spite 
of not being an ordinary photon position operator, nevertheless may have 
something to do with photon position. 
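The explicit expression announced in this paragraph can be sketched directly from (35), (36) and (37). What follows is a reconstruction under those definitions, not a quotation of the original display. The unit vector $\mathbf{n} = \mathbf{k}/\omega$, the Cartesian unit vectors $\mathbf{e}_j$, and the writing of the projection as $(\Phi \mathbf{C})(\mathbf{k}) = \mathbf{C} - \mathbf{n}(\mathbf{n} \cdot \mathbf{C})$ are notational assumptions introduced here.

```latex
% For transverse A, (36) gives k . (\hat{X}_j A) = -i A_j, so projecting out
% the longitudinal part of \hat{X}_j A yields the first line.
\begin{align*}
(X_j \mathbf{A})(\mathbf{k})
  &= i \frac{\partial \mathbf{A}}{\partial k_j}
     - \frac{\mathbf{k}}{\omega^2} \, \mathbf{k} \cdot
       \Big( i \frac{\partial \mathbf{A}}{\partial k_j} \Big)
   = i \frac{\partial \mathbf{A}}{\partial k_j}
     + i \frac{\mathbf{k}}{\omega^2} A_j , \\
([X_j, X_k] \mathbf{A})(\mathbf{k})
  &= \frac{1}{\omega^2} \, \Phi \big( \mathbf{e}_k A_j - \mathbf{e}_j A_k \big)
   = \varepsilon_{jkl} \frac{k_l}{\omega^3} \, \frac{\mathbf{k}}{\omega} \times \mathbf{A}
   = -i \, \varepsilon_{jkl} \frac{k_l}{\omega^3} \, (\sigma \mathbf{A})(\mathbf{k}) .
\end{align*}
```

The second line uses $\partial \mathbf{n} / \partial k_j = \Phi \mathbf{e}_j / \omega$ and the action (7) of $\sigma$. Since $\sigma$ has eigenvalues $\pm 1$ on the photon states, the commutator does not vanish, in line with the violation of (33) stated above.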

As already mentioned, the operator $\hat{\mathbf{X}}$ on $\hat{\mathcal{H}}$ has self-adjoint and mutually
commuting components. By (42) it also satisfies the transformation law
required for a position operator. Therefore the spectral measure $\hat{E}(\Delta)$
associated with $\hat{\mathbf{X}}$ is covariant with respect to $\hat{U}(\mathbf{a}, R)$. Any difficulties
associated with the photon position operator would thus be absent if, instead of
$\mathcal{H}$, the enlarged Hilbert space $\hat{\mathcal{H}}$ were the physical state space of the photon.
This suggests the following tentative description of position measurements for
photons. We consider $\hat{\mathbf{X}}$ on $\hat{\mathcal{H}}$, or the corresponding spectral measure $\hat{E}(\Delta)$, as
operators representing the photon position, which allows us to satisfy the usual
requirements at least formally. We deviate from ordinary quantum mechanics,
however, to the extent that not the whole Hilbert space $\hat{\mathcal{H}}$ but only the
subspace $\mathcal{H}$ of it is interpreted as the state space of the photon. Accordingly we
interpret, for physical states $\mathbf{A} \in \mathcal{H}$ (and only for them),

$$ \langle \mathbf{X} \rangle_{\mathbf{A}} = \langle \mathbf{A}, \hat{\mathbf{X}} \mathbf{A} \rangle \tag{44} $$

as expectation value of the photon position, and

$$ w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, \hat{E}(\Delta) \mathbf{A} \rangle \tag{45} $$

as probability for finding the photon in the space region $\Delta$. These definitions
satisfy the covariance requirements (23) and (28), as easily checked:

$$ \langle \mathbf{X} \rangle_{U(\mathbf{a}, R) \mathbf{A}} = \langle U(\mathbf{a}, R) \mathbf{A}, \hat{\mathbf{X}} U(\mathbf{a}, R) \mathbf{A} \rangle = \langle \hat{U}(\mathbf{a}, R) \mathbf{A}, \hat{\mathbf{X}} \hat{U}(\mathbf{a}, R) \mathbf{A} \rangle = \langle \mathbf{A}, (R \hat{\mathbf{X}} + \mathbf{a}) \mathbf{A} \rangle = R \langle \mathbf{X} \rangle_{\mathbf{A}} + \mathbf{a} $$

by (41) and (42); similarly, (28) follows from (41) and the covariance of $\hat{E}(\Delta)$.

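The unphysical Hilbert space $\hat{\mathcal{H}}$ can be eliminated altogether from this
description. By (38), equation (44) may also be written as

$$ \langle \mathbf{X} \rangle_{\mathbf{A}} = \langle \mathbf{A}, \mathbf{X} \mathbf{A} \rangle \tag{46} $$

with $\mathbf{X}$ defined by (37), and (45) may be reformulated as

$$ w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, F(\Delta) \mathbf{A} \rangle \tag{47} $$

with operators $F(\Delta)$ on $\mathcal{H}$ defined by

$$ F(\Delta) = \Phi \hat{E}(\Delta) |_{\mathcal{H}} \tag{48} $$

From $0 \leq w_{\mathbf{A}}(\Delta) \leq 1$ for all normalized states $\mathbf{A}$ we get

$$ F(\Delta)^* = F(\Delta), \qquad 0 \leq F(\Delta) \leq 1 \tag{49} $$

or in words: all $F(\Delta)$ are self-adjoint, non-negative and bounded in norm by
one. They are in general not projection operators, except for particular regions
$\Delta$ like $\emptyset$ and $\mathbb{R}^3$ (see below). This follows from a simple mathematical result:

    For two projection operators $E_1$ and $E_2$, $E_1 E_2 E_1$ is a projection operator
    if and only if $E_1$ and $E_2$ commute.*                                   (50)

Assume all $F(\Delta)$ to be projection operators, which by (48) means that all
$\Phi \hat{E}(\Delta) \Phi$ are projection operators on $\hat{\mathcal{H}}$. Thus, by (50), $\Phi$ commutes with all
$\hat{E}(\Delta)$, and therefore also with $\hat{X}_j = \int x_j \, d\hat{E}(\mathbf{x})$. This, however, is a contradiction
since, by (36), there are $\mathbf{A} \in \mathcal{H}$ with $\hat{X}_j \mathbf{A} \notin \mathcal{H}$.

*Proof: Let $F = E_1 E_2 E_1$ be a projection operator. Then $F^2 = E_1 E_2 E_1 E_2 E_1 = F$. This implies, for
$A = E_2 E_1 - E_1 E_2 E_1$, that $A^* A = 0$, and thus $0 = A = A^* = A^* - A = [E_1, E_2]$. The converse is
well known.

The spectral measure $\hat{E}(\Delta)$ satisfies relations of the form (26) and (27).
Together with (48) this leads to similar relations for the operators $F(\Delta)$. We
obtain from (26)

$$ F(\emptyset) = 0, \qquad F(\mathbb{R}^3) = 1, \qquad F\Big( \bigcup_i \Delta_i \Big) = \sum_i F(\Delta_i) \quad \text{if } \Delta_i \cap \Delta_j = \emptyset \text{ for } i \neq j \tag{51} $$

and from (27), or directly from (51),

$$ F(\Delta_1 \cup \Delta_2) = F(\Delta_1) + F(\Delta_2) - F(\Delta_1 \cap \Delta_2), \qquad F(\Delta') = 1 - F(\Delta) \tag{52} $$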


The first relation of (27), however, has no simple analogue for the operators
$F(\Delta)$. We also cannot conclude from (51) that the operators $F(\Delta)$ commute with
each other.* The physical interpretation of (51) and (52) in terms of the
probabilities (47) is obvious. Any correspondence $\Delta \to F(\Delta)$ of space regions $\Delta$
and operators $F(\Delta)$ with the properties (49) and (51) is called here, as usual, a
POV (positive operator valued) measure on $\mathbb{R}^3$. Our POV measure $F(\Delta)$
satisfies the additional condition

$$ U(\mathbf{a}, R) \, F(\Delta) \, U^*(\mathbf{a}, R) = F(\Delta_{\mathbf{a}, R}) \tag{53} $$

and is therefore called Euclidean covariant with respect to $U(\mathbf{a}, R)$. Equation
(53) follows immediately, since the condition (28) is satisfied for $w_{\mathbf{A}}(\Delta)$ as given
by (47), with $\mathbf{A}$ arbitrary. A (covariant) spectral measure, clearly, is a particular
case of a (covariant) POV measure, distinguished by the additional property
that $(F(\Delta))^2 = F(\Delta)$ for all $\Delta$.
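The difference between a spectral measure and a POV measure obtained by projecting onto a subspace can be seen in a finite-dimensional toy model. The sketch below is an illustration only and is not the photon construction: the three-point 'space' {0, 1, 2}, the real three-dimensional enlarged space, the choice of the 'physical' subspace as the plane orthogonal to (1, 1, 1), and all names are assumptions made here.

```python
import math

DIM = 3

def matmul(a, b):
    """Product of two DIM x DIM matrices given as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]

def matsub(a, b):
    """Difference of two DIM x DIM matrices."""
    return [[a[r][c] - b[r][c] for c in range(DIM)] for r in range(DIM)]

def max_abs(a):
    """Largest absolute matrix element."""
    return max(abs(x) for row in a for x in row)

def expectation(op, vec):
    """<vec, op vec> for a real vector and a real matrix."""
    return sum(vec[r] * op[r][c] * vec[c] for r in range(DIM) for c in range(DIM))

IDENTITY = [[1.0 if r == c else 0.0 for c in range(DIM)] for r in range(DIM)]

# Spectral measure on the enlarged space: E_HAT[i] projects onto basis vector i.
E_HAT = [[[1.0 if r == c == i else 0.0 for c in range(DIM)] for r in range(DIM)]
         for i in range(DIM)]

# Projection onto the "physical" subspace orthogonal to (1, 1, 1)/sqrt(3).
PHI = [[IDENTITY[r][c] - 1.0 / DIM for c in range(DIM)] for r in range(DIM)]

# Compressed operators F_i = PHI E_HAT_i PHI, the toy analogue of (48).
F = [matmul(PHI, matmul(E, PHI)) for E in E_HAT]

# Sum of all F_i; it should reproduce PHI, i.e. the identity on the subspace.
F_SUM = [[sum(F[i][r][c] for i in range(DIM)) for c in range(DIM)]
         for r in range(DIM)]

if __name__ == "__main__":
    state = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]  # normalized, in the subspace
    print("additivity defect:", max_abs(matsub(F_SUM, PHI)))
    print("idempotency defect of F_0:", max_abs(matsub(matmul(F[0], F[0]), F[0])))
    print("commutator size [F_0, F_1]:",
          max_abs(matsub(matmul(F[0], F[1]), matmul(F[1], F[0]))))
    print("probabilities:", [expectation(F[i], state) for i in range(DIM)])
```

In this toy model the $F_i$ still add up to the identity on the subspace and give probabilities between zero and one, but they are neither idempotent nor mutually commuting, which mirrors the properties stated for $F(\Delta)$.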

*In fact they do not commute, for otherwise equation (54) below would imply commutativity of the
components $X_j$ of $\mathbf{X}$.

With this terminology, the formalism proposed here may be characterized by
the fact that it describes the localization probabilities $w_{\mathbf{A}}(\Delta)$, via (47), in terms
of a covariant POV measure $F(\Delta)$ instead of, as usual, a covariant spectral
measure. The 'position operator' $\mathbf{X}$ introduced in addition is already determined
by $F(\Delta)$. From $\hat{\mathbf{X}} = \int \mathbf{x} \, d\hat{E}(\mathbf{x})$ we obtain, for $\mathbf{A} \in \mathcal{H}$ and
$dF(\mathbf{x}) \equiv F(d^3x) = \Phi \, d\hat{E}(\mathbf{x}) |_{\mathcal{H}}$,

$$ \mathbf{X} \mathbf{A} = \Phi \hat{\mathbf{X}} \mathbf{A} = \Phi \int \mathbf{x} \, d\hat{E}(\mathbf{x}) \mathbf{A} = \int \mathbf{x} \, \Phi \, d\hat{E}(\mathbf{x}) \mathbf{A} = \int \mathbf{x} \, dF(\mathbf{x}) \mathbf{A} $$

or, shortly,

$$ \mathbf{X} = \int \mathbf{x} \, dF(\mathbf{x}) \tag{54} $$

and thus

$$ \langle \mathbf{X} \rangle_{\mathbf{A}} = \int \mathbf{x} \, \langle \mathbf{A}, dF(\mathbf{x}) \mathbf{A} \rangle \tag{55} $$

[For a physical interpretation of (55) compare the discussion of (30). As shown
by (54), the POV measure $F(\Delta)$ provides a substitute for the non-existent
common spectral representation of the three components $X_j$ of $\mathbf{X}$.] The
converse, however, is not true: There is no general procedure for reconstructing
the POV measure $F(\Delta)$ from the operator $\mathbf{X}$ related to it by (54) (unless
$F(\Delta)$ is known to be a spectral measure, which case was discussed above). This
is due to the fact that a given operator $\mathbf{X}$ may have several representations of
the form (54) with different POV measures $F(\Delta)$, as will be shown by means of
an example in Section 6.

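Moreover, as compared to the case of an 'ordinary' position operator (i.e.
one belonging to a spectral measure), the knowledge of $\mathbf{X}$ is also less useful here
from a physical point of view. Whereas in both cases the expectation values of
position in arbitrary states $\mathbf{A}$ may be calculated in terms of $\mathbf{X}$ as $\langle \mathbf{A}, \mathbf{X} \mathbf{A} \rangle$, the
mean square deviations $\Delta_{\mathbf{A}} X_j$ are given by the familiar formula

$$ (\Delta_{\mathbf{A}} X_j)^2 = \| (X_j - \langle X_j \rangle_{\mathbf{A}}) \mathbf{A} \|^2 = \| X_j \mathbf{A} \|^2 - (\langle X_j \rangle_{\mathbf{A}})^2 \tag{56} $$

for an 'ordinary' position operator $\mathbf{X}$ only. For our position observable given by
the POV measure $F(\Delta)$, the physical meaning of $\langle \mathbf{A}, dF(\mathbf{x}) \mathbf{A} \rangle$ implies

$$ (\Delta_{\mathbf{A}} X_j)^2 = \int (x_j - \bar{x}_j)^2 \, \langle \mathbf{A}, dF(\mathbf{x}) \mathbf{A} \rangle, \qquad \bar{x}_j = \langle X_j \rangle_{\mathbf{A}} \tag{57} $$

With (48) we get from this

$$ (\Delta_{\mathbf{A}} X_j)^2 = \int (x_j - \bar{x}_j)^2 \, \langle \mathbf{A}, d\hat{E}(\mathbf{x}) \mathbf{A} \rangle = \| (\hat{X}_j - \langle X_j \rangle_{\mathbf{A}}) \mathbf{A} \|^2 = \| \hat{X}_j \mathbf{A} \|^2 - (\langle X_j \rangle_{\mathbf{A}})^2 \tag{58} $$

which does not reduce to (56) since, in general, $\hat{X}_j \mathbf{A} \neq X_j \mathbf{A}$. A formal description
of position measurements for a photon in terms of the 'position operator'
$\mathbf{X}$ is thus incomplete, in contrast to the description in terms of the POV
measure $F(\Delta)$. On the other hand, $F(\Delta)$ is fixed uniquely by the operator $\hat{\mathbf{X}}$ on
the extended Hilbert space, since $\hat{\mathbf{X}}$ uniquely determines $\hat{E}(\Delta)$. As exemplified
by (44) and (58), important physical quantities may also be calculated directly
in terms of $\hat{\mathbf{X}}$.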
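The size of the discrepancy between (58) and (56) can be made explicit. The following is a sketch that assumes only (36), (37) and the orthogonality of the transverse and longitudinal parts in the enlarged space; it is not a formula quoted from the text.

```latex
% (1 - \Phi) picks out the longitudinal part; by (36) with k . A = 0,
% (k/\omega) . (\hat{X}_j A) = -i A_j / \omega.
\begin{align*}
\| \hat{X}_j \mathbf{A} \|^2
  &= \| \Phi \hat{X}_j \mathbf{A} \|^2 + \| (1 - \Phi) \hat{X}_j \mathbf{A} \|^2
   = \| X_j \mathbf{A} \|^2 + \int d^3k \, \frac{|A_j(\mathbf{k})|^2}{\omega^2} , \\
(\Delta_{\mathbf{A}} X_j)^2
  &= \Big[ \| X_j \mathbf{A} \|^2 - (\langle X_j \rangle_{\mathbf{A}})^2 \Big]
     + \int d^3k \, \frac{|A_j(\mathbf{k})|^2}{\omega^2} .
\end{align*}
```

Under these assumptions the mean square deviation obtained from the POV measure always exceeds the 'naive' expression built from $X_j$ alone, by a term that is weighted towards small $\omega$, i.e. long wavelengths.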

From the point of view of quantum mechanics in its usual form, our
description of position measurements in terms of a POV measure $F(\Delta)$ looks at
least rather unconventional. Moreover, the explicit construction of $F(\Delta)$
described above is quite heuristic. Before looking for a better theoretical
justification of the formalism, however, we will first derive some physical 
consequences from it. If these consequences look reasonable, this may perhaps 
help to strengthen the subsequent, more theoretical arguments in favour of our 


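We start from Schwarz's inequality

$$ \| \mathbf{A}' \| \cdot \| \mathbf{A}'' \| \geq | \langle \mathbf{A}', \mathbf{A}'' \rangle | \geq | \mathrm{Im} \, \langle \mathbf{A}', \mathbf{A}'' \rangle | = \tfrac{1}{2} | \langle \mathbf{A}', \mathbf{A}'' \rangle - \langle \mathbf{A}'', \mathbf{A}' \rangle | \tag{59} $$

for arbitrary vectors $\mathbf{A}'$ and $\mathbf{A}''$ in $\hat{\mathcal{H}}$. From this we obtain, for a normalized state
vector $\mathbf{A} \in \mathcal{H}$ in the domain of definition of both $\hat{X}_i$ and $\hat{P}_j$ (or, equivalently, of
both $X_i$ and $P_j$), the estimate

$$ \| (\hat{X}_i - x_i) \mathbf{A} \| \cdot \| (\hat{P}_j - p_j) \mathbf{A} \| \geq \tfrac{1}{2} | \langle \hat{X}_i \mathbf{A}, \hat{P}_j \mathbf{A} \rangle - \langle \hat{P}_j \mathbf{A}, \hat{X}_i \mathbf{A} \rangle | \tag{60} $$

with $x_i = \langle X_i \rangle_{\mathbf{A}}$, $p_j = \langle P_j \rangle_{\mathbf{A}}$. (Take $\mathbf{A}' = (\hat{X}_i - x_i) \mathbf{A}$ and $\mathbf{A}'' = (\hat{P}_j - p_j) \mathbf{A} =
(P_j - p_j) \mathbf{A}$, and note that the terms with $x_i$ and $p_j$ cancel in $\langle \mathbf{A}', \mathbf{A}'' \rangle - \langle \mathbf{A}'', \mathbf{A}' \rangle$.)
According to (58) and the analogue of (56) for the 'ordinary' observables $P_j$, the
left-hand side of (60) is equal to $\Delta_{\mathbf{A}} X_i \cdot \Delta_{\mathbf{A}} P_j$. A simple calculation, using the
explicit definitions (35) of $\hat{\mathbf{X}}$ and (4) of $\hat{P}_j$, yields

$$ \langle \hat{X}_i \mathbf{B}, \hat{P}_j \mathbf{B}' \rangle - \langle \hat{P}_j \mathbf{B}, \hat{X}_i \mathbf{B}' \rangle = i \delta_{ij} \langle \mathbf{B}, \mathbf{B}' \rangle \tag{61} $$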

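for arbitrary vectors $\mathbf{B}$ and $\mathbf{B}'$ in $\hat{\mathcal{H}}$ belonging to the domain of definition of both
$\hat{X}_i$ and $\hat{P}_j$. An alternative, more abstract proof of (61) uses the transformation
property (42) of $\hat{\mathbf{X}}$ and the fact that $\hat{P}_j$ is the self-adjoint generator of
translation along the $j$th axis, i.e.

$$ e^{i \lambda \hat{P}_j} \hat{X}_i e^{-i \lambda \hat{P}_j} = \hat{X}_i + \lambda \delta_{ij} \tag{62} $$

as follows:

$$ \langle \hat{X}_i \mathbf{B}, e^{i \lambda \hat{P}_j} \mathbf{B}' \rangle = \langle \mathbf{B}, \hat{X}_i e^{i \lambda \hat{P}_j} \mathbf{B}' \rangle = \langle \mathbf{B}, e^{i \lambda \hat{P}_j} (\hat{X}_i - \lambda \delta_{ij}) \mathbf{B}' \rangle = \langle e^{-i \lambda \hat{P}_j} \mathbf{B}, \hat{X}_i \mathbf{B}' \rangle - \lambda \delta_{ij} \langle \mathbf{B}, e^{i \lambda \hat{P}_j} \mathbf{B}' \rangle $$

by (62), and thus

$$ \langle \hat{X}_i \mathbf{B}, \hat{P}_j \mathbf{B}' \rangle - \langle \hat{P}_j \mathbf{B}, \hat{X}_i \mathbf{B}' \rangle = \frac{1}{i} \frac{d}{d\lambda} \Big( \langle \hat{X}_i \mathbf{B}, e^{i \lambda \hat{P}_j} \mathbf{B}' \rangle - \langle e^{-i \lambda \hat{P}_j} \mathbf{B}, \hat{X}_i \mathbf{B}' \rangle \Big) \Big|_{\lambda = 0} = i \frac{d}{d\lambda} \Big( \lambda \delta_{ij} \langle \mathbf{B}, e^{i \lambda \hat{P}_j} \mathbf{B}' \rangle \Big) \Big|_{\lambda = 0} = i \delta_{ij} \langle \mathbf{B}, \mathbf{B}' \rangle $$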

From (60) and (61) we obtain Heisenberg's position-momentum uncertainty
relation

$$\Delta_A X_i \cdot \Delta_A P_j \ge \tfrac{1}{2}\,\delta_{ij} \qquad (63)$$

for all normalized photon states $A$ of the type specified above. As apparent
from (58) for $\Delta_A X_i$, and from the analogue of (56) for $\Delta_A P_j$, such states are the
only ones for which both $\Delta_A X_i$ and $\Delta_A P_j$ are finite. For all other states,
therefore, (63) is satisfied in a trivial way.
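
The chain from Schwarz's inequality (59) to the uncertainty relation (63) can be spelled out in a few lines. The following is only a sketch: hats denote the operators on the extended Hilbert space, $A$ is assumed normalized and in the relevant domains, $A' = (\hat{X}_i - x_i)A$ and $A'' = (\hat{P}_j - p_j)A$ as in the text, and $x_i = \langle A, \hat{X}_i A\rangle$, $p_j = \langle A, \hat{P}_j A\rangle$ are taken to be real because both operators are symmetric.

```latex
\begin{align*}
\langle A', A''\rangle
  &= \langle \hat X_i A, \hat P_j A\rangle - p_j \langle \hat X_i A, A\rangle
     - x_i \langle A, \hat P_j A\rangle + x_i p_j
   = \langle \hat X_i A, \hat P_j A\rangle - x_i p_j, \\
\langle A'', A'\rangle
  &= \langle \hat P_j A, \hat X_i A\rangle - x_i p_j, \\
\langle A', A''\rangle - \langle A'', A'\rangle
  &= \langle \hat X_i A, \hat P_j A\rangle - \langle \hat P_j A, \hat X_i A\rangle
   = i\,\delta_{ij}\,\langle A, A\rangle = i\,\delta_{ij}, \\
\Delta_A X_i \cdot \Delta_A P_j
  &\ge \tfrac12\,\bigl| i\,\delta_{ij} \bigr| = \tfrac12\,\delta_{ij}.
\end{align*}
```

The third line uses (61) with $B = B' = A$; no commutator of unbounded operators has to be formed at any step.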

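Some readers might find our derivation of (63) rather pedantic. They might
feel it would be easier to use the commutation relation

$$[\hat{X}_i, \hat{P}_j] = i\,\delta_{ij}$$

for an evaluation of the right-hand side of (60) in the form

$$\langle \hat{X}_i A, \hat{P}_j A\rangle - \langle \hat{P}_j A, \hat{X}_i A\rangle = \langle A, [\hat{X}_i, \hat{P}_j]A\rangle = i\,\delta_{ij}$$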



This short-cut calculation, clearly, is not perfectly rigorous. That it may even 
lead to wrong physical conclusions is explained elsewhere (Kraus, 1970). 

As mentioned before (see footnote at beginning of Section 3), the POV
measure $F(\Delta)$ describes position measurements at time $t = 0$ in a given inertial
frame. Therefore the operators $U(a, \Lambda)F(\Delta)U^*(a, \Lambda)$ describe position measurements
at time $t' = 0$ in a 'primed' inertial frame, generated from the original
'unprimed' one by the Poincaré transformation $(a, \Lambda)$. Of particular interest is
the case where $(a, \Lambda)$ is a pure time translation by the amount $t$, which leads to
the POV measure

$$F_t(\Delta) = U(t)F(\Delta)U^*(t), \qquad U(t) = U(\{0, t\}, 1) = e^{iHt} \qquad (64)$$

As the corresponding measurements, obviously, may also be interpreted as
position measurements at time $t$ in the original inertial frame, equation (64) is
nothing but the familiar time dependence of the observables $F_t(\Delta)$ in the
Heisenberg picture.
The same transformation $F(\Delta) \to F_t(\Delta)$ is also obtained from

$$E_t(\Delta) = \hat{U}(t)E(\Delta)\hat{U}^*(t), \qquad F_t(\Delta) = \Phi E_t(\Delta)|_{\mathcal{H}} \qquad (65)$$

with a unitary operator $\hat{U}(t)$ on $\hat{\mathcal{H}}$ satisfying

$$[\hat{U}(t), \Phi] = 0, \qquad \hat{U}(t)|_{\mathcal{H}} = U(t) \qquad (66)$$

In fact, (65) and (66) imply

$$F_t(\Delta) = \Phi E_t(\Delta)|_{\mathcal{H}} = \Phi \hat{U}(t)E(\Delta)\hat{U}^*(t)|_{\mathcal{H}} = U(t)\,\Phi E(\Delta)|_{\mathcal{H}}\,U^*(t) = U(t)F(\Delta)U^*(t)$$

in accordance with (64). The condition (66) is satisfied, for example, by

$$\hat{U}(t) = e^{i\hat{H}t} \qquad (67)$$

with the 'natural' extension $\hat{H}$ of the Hamiltonian $H$ described in Section 2. Of
course, (66) has many other solutions besides (67), but this particular one is
very convenient since it permits the explicit calculation of the self-adjoint
operator

$$\hat{X}_t = \hat{U}(t)\hat{X}\hat{U}^*(t) \qquad (68)$$

which corresponds to the spectral measure $E_t(\Delta)$. Indeed, a simple calculation
using the explicit expressions

$$\hat{X} = i\nabla_k, \qquad \hat{U}(t) = e^{i\omega t}$$

yields

$$\hat{X}_t = \hat{X} + \hat{V}t \qquad (69)$$

with the (vector) multiplication operator

$$\hat{V} = \hat{P}/\hat{H} = k/\omega \qquad (70)$$



on $\hat{\mathcal{H}}$. Implicitly equation (69) contains the complete solution of the Heisenberg
equation of motion (64) for $F_t(\Delta)$, since $\hat{X}_t$ uniquely determines its
spectral measure $E_t(\Delta)$ which in turn, via (65), yields $F_t(\Delta)$. This does not mean
that (69) is really helpful for the explicit calculation of the $F_t(\Delta)$. However, in
most cases one is satisfied with a much less detailed description of position
measurements at different times, and in such cases (69) may be applied directly.
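
The 'simple calculation' behind (69) can be sketched as follows. The sketch assumes that $\hat{X}_j$ acts as $i\,\partial/\partial k_j$ and $\hat{U}(t)$ as multiplication by $e^{i\omega t}$ on sufficiently smooth vector functions $f(k)$, with the massless dispersion $\omega = |k|$ (units with $c = 1$).

```latex
\begin{align*}
\bigl(\hat U(t)\,\hat X_j\,\hat U^*(t) f\bigr)(k)
  &= e^{i\omega t}\, i\frac{\partial}{\partial k_j}\bigl(e^{-i\omega t} f(k)\bigr) \\
  &= e^{i\omega t}\Bigl( t\,\frac{\partial\omega}{\partial k_j}\, e^{-i\omega t} f(k)
       + e^{-i\omega t}\, i\frac{\partial f}{\partial k_j}(k) \Bigr) \\
  &= i\frac{\partial f}{\partial k_j}(k) + t\,\frac{k_j}{\omega}\, f(k),
  \qquad \frac{\partial\omega}{\partial k_j} = \frac{k_j}{\omega}
  \ \text{ for } \omega = |k| .
\end{align*}
```

The last line is the $j$th component of $\hat{X} + \hat{V}t$ with the multiplication operator $\hat{V} = k/\omega$.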
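Consider, for instance, the time-dependent expectation value $\langle X_t\rangle_A$ of
position in a given state $A$. We obtain

$$\langle X_t\rangle_A = \langle A, \hat{X}_t A\rangle = \langle A, \hat{X}A\rangle + \langle A, \hat{V}A\rangle t = \langle A, X_t A\rangle = \langle A, XA\rangle + \langle A, VA\rangle t \qquad (71)$$

with

$$X_t = \Phi\hat{X}_t|_{\mathcal{H}} = X + Vt \qquad (72)$$

and

$$V = \Phi\hat{V}|_{\mathcal{H}} = P/H = k/\omega \qquad (73)$$

a multiplication operator on $\mathcal{H}$ which, obviously, has to be interpreted as the
photon velocity operator. Its components $V_j$ satisfy the relations

$$-1 \le V_j \le 1, \qquad |V| = (V_1^2 + V_2^2 + V_3^2)^{1/2} = 1 \qquad (74)$$

as to be expected from this interpretation.* As a $k$ space average with weight
function $\bar{A}(k) \cdot A(k)$ of the unit vectors $k/\omega$, the vector $\langle V\rangle_A = \langle A, VA\rangle$ (with
components $\langle V_j\rangle_A = \langle A, V_jA\rangle$) has a length $|\langle V\rangle_A|$ smaller than one.† Thus (71)
implies

$$\frac{d}{dt}\langle X_t\rangle_A = \langle V\rangle_A, \qquad |\langle V\rangle_A| < 1 \qquad (75)$$

i.e. the time-dependent average photon positions $\langle X_t\rangle_A$ lie on a straight
time-like worldline.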

For the time dependence of the mean square deviation of the $j$th photon
coordinate, an obvious generalization of (58) together with (69) yields

$$\Delta_A X_{jt} = \|(\hat{X}_{jt} - x_{jt})A\| = \|(\hat{X}_j - x_j)A + (\hat{V}_j - v_j)tA\|$$

with

$$x_j = \langle X_j\rangle_A, \qquad v_j = \langle V_j\rangle_A, \qquad x_{jt} = \langle X_{jt}\rangle_A = x_j + v_jt$$

*Note that in our units the light velocity is equal to one.

†The value one is excluded since, for normalized $A$, the weight function $\bar{A}(k) \cdot A(k)$ cannot be a
delta function.


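and thus

$$\Delta_A X_{jt} \le \|(\hat{X}_j - x_j)A\| + |t|\,\|(\hat{V}_j - v_j)A\|$$

Here

$$\|(\hat{X}_j - x_j)A\| = \Delta_A X_j$$

and

$$\|(\hat{V}_j - v_j)A\|^2 = \|(V_j - v_j)A\|^2 = (\Delta_A V_j)^2 = \|V_jA\|^2 - v_j^2 \le \|V_jA\|^2 < 1$$

in virtue of (74).* Therefore we obtain the estimates

$$\Delta_A X_{jt} \le \Delta_A X_j + \Delta_A V_j\,|t|, \qquad \Delta_A V_j < 1 \qquad (76)$$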

which show that the growth in time of the mean square deviations of photon
coordinates (or, if translated into the Schrödinger picture, of the widths of a
wave packet in the directions of the three coordinate axes) is also restricted by
the velocity of light. By using rotational invariance [or a simple generalization
of (58), compare equation (79)], a similar estimate may be derived for the width
$\Delta_A(e \cdot X_t)$ of the wave packet $A$ in the direction of an arbitrary unit vector $e$.

The relations (75) and (76) express a certain kind of causal behaviour of
position measurements. If, for instance, there were wave packets $A$ with
average positions $\langle X_t\rangle_A$ moving faster than light, this would hardly be consistent
with relativistic causality, since one could easily imagine the use of such
wave packets as faster-than-light signals. Likewise, the existence of wave
packets with average widths $\Delta_A X_{jt}$ growing with superluminal velocity would
look suspect from the point of view of causality, although it is not at all obvious
how such wave packets could be used to exchange signals between space-like
separated observers.

Another causality requirement for successive position measurements would
be the following: If, in some state $A$, the photon position at time $t = 0$ is
certainly inside a region $\Delta$, then at any time $t$ the photon has to be with
certainty in the region

$$\Delta_t = \{x \mid |x - y| \le |t| \text{ for some } y \in \Delta\} \qquad (77)$$

i.e.

$$\langle A, F(\Delta)A\rangle = 1 \text{ implies } \langle A, F_t(\Delta_t)A\rangle = 1 \text{ for all } t \qquad (78)$$


As the corresponding property for classical particles is obvious, the postulate
(78) seems to be well-founded, too. However, (78) is simply wrong, and is
moreover wrong not only for photons and the POV measure $F(\Delta)$ considered
here but, quite generally, for all relativistic elementary particles and for all
conceivable position observables. More precisely, it turns out that $\langle A, F_t(\Delta_t)A\rangle$

*Namely, (74) implies $\|V_j\| = 1$, and thus $\|V_jA\| \le \|A\| = 1$. $\|V_jA\|^2 = \langle A, V_j^2A\rangle = 1$ would mean that
$A$ is an eigenstate of $V_j^2$ of eigenvalue one, but the spectrum of $V_j^2 = k_j^2/\omega^2$ is purely continuous.


is strictly less than one for all states $A$ with $\langle A, F(\Delta)A\rangle = 1$ and all times $t \ne 0$.
In this generality, the result is due to Hegerfeldt (1974); we refer to his paper
for the (surprisingly simple) proof. Of course, the requirement (78) is nontrivial
only if there are at least some regions $\Delta$ and corresponding states $A$ with
$\langle A, F(\Delta)A\rangle = 1$. In our present case, however, it is easily seen that such states $A$
indeed exist for an arbitrarily given region $\Delta$.*

On the other hand, it is clear from (75) that only the 'tail' of the wave packet
can be outside of $\Delta_t$ at time $t$. Indeed, its centre $\langle X_t\rangle_A$ is certainly in $\Delta_t$ for all $t$
since, at least for convex regions $\Delta$, it is in $\Delta$ for $t = 0$. A similar intuitive picture
of the spreading of wave packets is suggested by the following estimate. With
$\bar{x} = \langle X\rangle_A$ and an arbitrary unit vector $e$, we define by

$$(\delta(e, t))^2 = \int (e \cdot (x - \bar{x}))^2\, \langle A, dF_t(x)A\rangle \qquad (79)$$

a measure $\delta(e, t)$ for the average spatial distance, in the direction of $e$, of the
wave packet at time $t$ from its centre $\bar{x}$ at time $t = 0$. Since

$$\delta(e, t) = \|(e \cdot (\hat{X}_t - \bar{x}))A\|$$

(compare the derivation of (58)), we find from (69):

$$\delta(e, t) \le \|(e \cdot (\hat{X} - \bar{x}))A\| + |t|\,\|(e \cdot \hat{V})A\|$$

Here

$$\Delta_A(e \cdot X) = \|(e \cdot (\hat{X} - \bar{x}))A\| = \delta(e, 0)$$

clearly, is the width of the wave packet, measured along the direction of $e$, at
time $t = 0$, whereas

$$\|(e \cdot \hat{V})A\| = \|(e \cdot V)A\| \le \|A\| = 1$$

by (74). Thus, finally,

$$\delta(e, t) \le \Delta_A(e \cdot X) + |t| \qquad (80)$$


The physical meaning of this estimate is obvious. Estimates analogous to (75),
(76) and (80) may also be derived, e.g., for the Newton-Wigner position
operator of a particle with mass $m > 0$. [In this case the velocity operator is
$P/H = P(P^2 + m^2)^{-1/2}$.]
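
For the massive case mentioned in brackets, the analogue of the bound entering (80) can be sketched as follows. This is an illustration only: it assumes a momentum-space wave function $A(k)$ normalized as $\int d^3k\,|A(k)|^2 = 1$, on which $P$ acts as multiplication by $k$ and $H$ as multiplication by $(k^2 + m^2)^{1/2}$.

```latex
\begin{gather*}
V = \frac{P}{H} = \frac{k}{\sqrt{k^2 + m^2}}, \qquad
|V| = \frac{|k|}{\sqrt{k^2 + m^2}} < 1 \quad (m > 0), \\
\|(e \cdot V)A\|^2 = \int d^3k\, \frac{(e\cdot k)^2}{k^2 + m^2}\,|A(k)|^2
  \le \int d^3k\, \frac{k^2}{k^2+m^2}\,|A(k)|^2 < \|A\|^2 = 1 .
\end{gather*}
```

Under these assumptions the speed is strictly below one for every momentum, whereas in the photon case $|V| = 1$ holds as an operator identity.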

We feel that the violation of (78) should not be taken to indicate an 
'acausality' until it is proved that its violation indeed makes possible, at least in 
principle, the exchange of faster-than-light signals. It seems not implausible at 
least that the impossibility of such signals already follows from estimates like 
(75) or (80). This problem, clearly, should be investigated further. In any case, 
however, the violation of (78) is a nice example of how misleading the classical 
particle picture may be in quantum mechanics. 

*Since $E(\Delta)$ acts as multiplication by $\chi_\Delta(x)$ on the Fourier transform $\tilde{A}(x)$ of $A(k)$, we
have $E(\Delta)A = A$, and thus $\langle A, F(\Delta)A\rangle = 1$, if $\tilde{A}(x) = 0$ outside of $\Delta$.


The simplest measurements are those for which only two different outcomes 
'yes' and 'no', or one and zero, are possible. In usual quantum mechanics these 
'yes-no' observables are represented by projection operators $E$ on the state
space $\mathcal{H}$, such that $\langle A, EA\rangle$ is the probability for the outcome one (i.e., 'yes')
if $E$ is measured in the normalized state $A \in \mathcal{H}$. If applied to particle detectors,
this formalism immediately leads to the description of particle position by a
covariant spectral measure $E(\Delta)$. However, if quantum mechanical yes-no
measurements are investigated more closely [e.g. by considering suitable
models (Kraus, 1971 and 1974)], one realizes that they do not in general
correspond to projection operators. Instead, a general yes-no measurement
has to be described by an operator $F$ on $\mathcal{H}$ with

$$F^* = F, \qquad 0 \le F \le 1 \qquad (81)$$

the probability for the outcome 'yes' in state $A$ being given again by $\langle A, FA\rangle$.
The projection operators E are a very particular class of such operators F, and 
in fact any practically performable yes-no measurement most likely does not 
correspond to a projection operator. This observation may serve as the starting 
point for Ludwig's axiomatic reformulation of quantum theory (Ludwig, 
1970). General yes-no experiments are called 'effects' in this theory, whereas 
the particular ones corresponding to projection operators are denoted as 

'decision effects'. 

The corresponding generalization of the notion of an observable is almost 
obvious. Usually a quantum-mechanical observable is taken to correspond to a 
self-adjoint operator $X$ on $\mathcal{H}$, and the spectral measure $E(\Delta)$ on the real line
obtained from the spectral representation

$$X = \int x\, dE(x) \qquad (82)$$

is interpreted as follows: For a given interval $\Delta$, $E(\Delta)$ corresponds to the yes-no
observable which takes the value one (respectively zero) if for the original
observable a value $x$ in (respectively outside of) $\Delta$ is measured. The generalization
consists of admitting observables for which a general POV measure $F(\Delta)$
on the real line takes the role of $E(\Delta)$, with the same physical interpretation.*
From this point of view, then, the use of a covariant POV measure $F(\Delta)$ on $\mathbb{R}^3$
for the description of photon detectors looks quite natural.† The properties of
such generalized observables have been illustrated in Section 4 by the example
of the photon coordinates $X_i$. In particular, we have seen that the associated
operator

*In particular, we do not interpret POV measures as describing inaccurate ('fuzzy') measurements,
as done by Ali and Emch (1974).

†The operators $F(\Delta)$ are interpreted here as describing 'exact' photon positions, i.e. a click in the
'counter' corresponding to $F(\Delta)$ is taken to indicate that the photon is really inside $\Delta$. We feel that
one could speak meaningfully of inaccurate position measurements only if there were another,
'more exact' position observable to compare with.



$$X = \int x\, dF(x) \qquad (83)$$


does not completely describe a generalized observable.* 

One further point, however, is worth mentioning here. By their very 
definition, any two effects $F(\Delta_1)$ and $F(\Delta_2)$ belonging to the POV measure of a
generalized observable can be measured together (e.g. simply by determining
the value of the observable with sufficient precision). In Ludwig's terminology,
such effects are called 'coexistent'. If one is dealing with a spectral measure
$E(\Delta)$, the well-known necessary and sufficient condition for the 'coexistence'
(usually called 'commensurability' in this case) of $E(\Delta_1)$ and $E(\Delta_2)$ is commutativity,
which is indeed satisfied for any spectral measure (cf. Section 3).
More generally, two effects $F_1$ and $F_2$ are coexistent if and only if there exist
three effects $F_1'$, $F_2'$ and $F_3$ such that

$$F_1 = F_1' + F_3, \qquad F_2 = F_2' + F_3, \qquad F_1' + F_2' + F_3 \le 1 \qquad (84)$$

(See Ludwig (1970), or Kraus (1974) for a more elementary discussion.)
Commutativity of $F_1$ and $F_2$ is sufficient to guarantee the validity of (84), but is
necessary only if $F_1$ or $F_2$ or both are projection operators. For $F_1 = F(\Delta_1)$ and
$F_2 = F(\Delta_2)$ belonging to a POV measure, (84) is satisfied in virtue of the
measure property (51), which implies

$$F(\Delta_1) = F(\Delta_1 \setminus \Delta_2) + F(\Delta_1 \cap \Delta_2)$$

$$F(\Delta_2) = F(\Delta_2 \setminus \Delta_1) + F(\Delta_1 \cap \Delta_2)$$

and

$$F(\Delta_1 \setminus \Delta_2) + F(\Delta_2 \setminus \Delta_1) + F(\Delta_1 \cap \Delta_2) = F((\Delta_1 \setminus \Delta_2) \cup (\Delta_2 \setminus \Delta_1) \cup (\Delta_1 \cap \Delta_2)) = F(\Delta_1 \cup \Delta_2) \le 1$$
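
That commutativity is sufficient for (84) can be made explicit by one possible choice of the three effects. The sketch below assumes only that $F_1$ and $F_2$ are commuting effects, and uses the standard fact that the product of two commuting positive operators is again positive; the particular choice of $F_1'$, $F_2'$, $F_3$ is an illustration, not necessarily the one used in the cited references.

```latex
\begin{gather*}
F_3 = F_1 F_2, \qquad F_1' = F_1(1 - F_2), \qquad F_2' = F_2(1 - F_1), \\
F_1' + F_3 = F_1, \qquad F_2' + F_3 = F_2, \\
F_1' + F_2' + F_3 = F_1 + F_2 - F_1F_2 = 1 - (1 - F_1)(1 - F_2) \le 1 .
\end{gather*}
```

Each of $F_1'$, $F_2'$, $F_3$ is a product of commuting positive operators bounded by one, hence itself an effect, and the last line holds because $(1 - F_1)(1 - F_2) \ge 0$.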

The attempt of Jauch and Piron (1967) and Amrein (1969) to construct a
photon position observable is closely related to the one discussed here. These
authors, however, insist on the description of yes-no observables by projection
operators, and therefore do not accept $F(\Delta)$ as describing a photon counter in
the space region $\Delta$. They take instead, for this purpose, the projection operator
$E'(\Delta)$ onto the subspace of eigenvectors of $F(\Delta)$ belonging to the eigenvalue
one. An equivalent definition is $E'(\Delta) = \Phi \cap E(\Delta)|_{\mathcal{H}}$, with the projection
operator $\Phi \cap E(\Delta)$ onto the intersection of the subspaces $\mathcal{H} = \Phi\hat{\mathcal{H}}$ and $E(\Delta)\hat{\mathcal{H}}$
of $\hat{\mathcal{H}}$. The covariance condition (29) and the first two relations of (26) are easily
seen to be satisfied for the operators $E'(\Delta)$. The third (additivity) condition of
(26) has thus to be violated, since otherwise $E'(\Delta)$ would be a covariant spectral
measure. Because this additivity condition has a direct physical interpretation,

*There are even POV measures $F(\Delta)$ for which (83) makes sense only if applied to the zero vector,
so that there is no operator $X$ at all. However, such 'observables' are pathological also from the
physical point of view.


we consider its violation as a serious disadvantage. Moreover, there are certain 
pairs of regions for which the corresponding operators $E'(\Delta)$ do not commute,
and thus do not describe commensurable measurements. 


At first sight the method used in Section 3 for constructing the Euclidean 
covariant POV measure $F(\Delta)$ for the photon might look somewhat fortuitous.
This is not the case, however, as the following discussion shows. We start with 

Theorem 1:

(1). Consider a Hilbert space $\mathcal{H}$ with a POV measure $F(\Delta)$ on $\mathbb{R}^3$. Then there
exists an extended Hilbert space $\hat{\mathcal{H}} \supseteq \mathcal{H}$ with a spectral measure $E(\Delta)$ on $\mathbb{R}^3$,
such that

$$F(\Delta) = \Phi E(\Delta)|_{\mathcal{H}}$$

$\Phi$ being the projection operator on $\hat{\mathcal{H}}$ with range $\mathcal{H}$.

(2). Let $F(\Delta)$ be covariant with respect to a continuous unitary representation
$U(a, R)$ of the Euclidean group on $\mathcal{H}$. Then there exists a continuous
unitary representation $\hat{U}(a, R)$ on $\hat{\mathcal{H}}$ which extends $U(a, R)$, i.e.

$$[\hat{U}(a, R), \Phi] = 0, \qquad \hat{U}(a, R)|_{\mathcal{H}} = U(a, R)$$

such that $E(\Delta)$ is covariant with respect to $\hat{U}(a, R)$.*

(3). An extension $\hat{\mathcal{H}}$ of $\mathcal{H}$ as described under (1) is called minimal if $\hat{\mathcal{H}}$ is
spanned by vectors of the form $E(\Delta)A$, with arbitrary regions $\Delta$ and arbitrary
vectors $A \in \mathcal{H}$. The space $\hat{\mathcal{H}}$ and the spectral measure $E(\Delta)$ (and, for covariant
$F(\Delta)$, also the representation $\hat{U}(a, R)$) of a minimal extension are unique up to
unitary equivalence. A non-minimal extension $\hat{\mathcal{H}}$ contains a subspace which
reduces the spectral measure $E(\Delta)$ (and the representation $\hat{U}(a, R)$, if $F(\Delta)$ is
covariant), and which is a minimal extension of $\mathcal{H}$.

This Theorem is also true for POV measures on arbitrary spaces (instead of $\mathbb{R}^3$)
and for more general covariance groups. In this general form, parts (1) and (3)
of the Theorem are due to Neumark (1943) [see also Riesz and Nagy (1956)] 
whereas part (2) has been proved recently by Neumann (1972). 

If one wants to construct a covariant POV measure $F(\Delta)$ on a Hilbert space
$\mathcal{H}$ with a given representation $U(a, R)$ of the Euclidean group, Theorem 1
suggests the following procedure: First, look for a suitable extension $\hat{U}(a, R)$ of
$U(a, R)$ to a larger Hilbert space $\hat{\mathcal{H}}$, such that on $\hat{\mathcal{H}}$ there exists a spectral
measure $E(\Delta)$ which is covariant with respect to $\hat{U}(a, R)$; then, take $F(\Delta) =
\Phi E(\Delta)|_{\mathcal{H}}$. It is easily shown that this $F(\Delta)$ is indeed a covariant POV measure,

*If one is dealing with particles of half-integer spin, then $U$ and $\hat{U}$ are representations not of the
Euclidean group itself but of its covering group [cf., for example, Wightman (1962)].


whereas Theorem 1 guarantees that every covariant POV measure may be
constructed in this way. A further advantage of this construction is the fact that
all representations $\hat{U}(a, R)$ of the Euclidean group which admit a covariant
spectral measure $E(\Delta)$ are explicitly known up to unitary equivalence:

Theorem 2: Consider a Hilbert space $\hat{\mathcal{H}}$ with a continuous unitary representation
$\hat{U}(a, R)$ of the Euclidean group and a covariant spectral measure $E(\Delta)$
on $\mathbb{R}^3$. Then there is a unitary transformation which brings $\hat{\mathcal{H}}$, $\hat{U}(a, R)$ and $E(\Delta)$
to the following standard form:

(1). $\hat{\mathcal{H}}$ consists of all complex 'vector' functions $f(k)$ of a real three-vector $k$,
with 'vector' components $f_\alpha(k)$, $\alpha \in I$ (some finite or infinite index set), which
are square-integrable in the sense that

$$\int d^3k \sum_\alpha |f_\alpha(k)|^2 < \infty$$

The inner product in $\hat{\mathcal{H}}$ is

$$\langle f, f'\rangle = \int d^3k \sum_\alpha \bar{f}_\alpha(k)\, f'_\alpha(k)$$

(2). $\hat{U}(a, R)$ is given by

$$(\hat{U}(a, R)f)_\alpha(k) = e^{-ik\cdot a} \sum_\beta D_{\alpha\beta}(R)\, f_\beta(R^{-1}k) \qquad (85)$$

with a continuous unitary representation of the rotation group by matrices
$D(R)$ with matrix elements $D_{\alpha\beta}(R)$, $\alpha, \beta \in I$.

(3). $E(\Delta)$ is the spectral measure of the self-adjoint 'position' operator

$$\hat{X} = \int x\, dE(x)$$

whose components $\hat{X}_j$ are defined by*

$$(\hat{X}_j f)_\alpha(k) = i\frac{\partial}{\partial k_j} f_\alpha(k)$$

This Theorem plays the crucial role in the paper of Wightman (1962), where a
detailed proof is given.

Representations $\hat{U}(a, R)$ of the Euclidean group of the form (85) are highly
reducible. First of all, the unitary representation $D(R)$ of the rotation group
may be decomposed in the usual way into irreducible representations $D_S(R)$
with fixed angular momentum quantum number ('spin') $S$,† of dimension

*In the 'position' representation, i.e. in terms of the Fourier transforms $\tilde{f}(x)$ of $f(k)$, the operators $\hat{X}_j$
and $E(\Delta)$ act as multiplication by $x_j$ and $\chi_\Delta(x)$, respectively.

†In our case each $S$ is integer, whereas half-integer $S$ occur in the case mentioned in the footnote to
Theorem 1.


$2S + 1$, which decomposes $\hat{U}(a, R)$ into subrepresentations $\hat{U}_S(a, R)$. For each
subrepresentation $\hat{U}_S(a, R)$, the representation space may be further decomposed
into $2S + 1$ subspaces with definite helicities $-S, -S + 1, \ldots, S - 1, S$,
which further reduce $\hat{U}_S(a, R)$ since helicity is Euclidean invariant.* (For
details see, for example, Amrein (1969).) On the other hand, the representation
$U(a, R)$ on the photon state space $\mathcal{H}$ may be decomposed into subrepresentations
with helicities $+1$ and $-1$ (cf. Section 2). Thus Theorem 2 forbids
the existence of a covariant spectral measure on $\mathcal{H}$ which would require, at
least, the presence also of helicity zero states. The simplest way of extending
$U(a, R)$ to a representation $\hat{U}(a, R)$ of the type (85) is, therefore, to add just
this 'missing' subrepresentation of helicity zero. This leads to a representation
$\hat{U}(a, R)$ of the form (85), with $D(R) = R$ irreducible and belonging to $S = 1$,
and exactly this has been done in Section 3. In the light of Theorems 1 and 2,
therefore, the construction of $F(\Delta)$ in Section 3 appears quite natural.

However, it is obvious now that this construction is only the simplest but not 
a unique one. There are very many different possibilities of embedding $U(a, R)$
into representations of the form (or unitary equivalent to) (85), which in 
general lead to different POV measures. We will illustrate this non-uniqueness 
by two simple examples. 

For instance, we can embed $U(a, R)$ by adding, besides the missing helicity
zero states, two other subrepresentations of helicities $+2$ and $-2$, so that we
obtain a representation $\hat{U}'(a, R)$ with $D'(R)$ belonging to $S = 2$. (The primes
serve to distinguish the present construction from the one considered in Section
3.) This representation may be realized concretely in the Hilbert space $\hat{\mathcal{H}}'$ of
complex symmetric traceless second-rank tensors $g_{ij}(k)$ ($i, j = 1, 2, 3$) with the
inner product

$$\langle g, g'\rangle = \int d^3k\, \bar{g}_{ij}(k)\, g'_{ij}(k) \qquad (86)$$

(sum convention) and the tensor transformation law

$$(\hat{U}'(a, R)g)_{ij}(k) = e^{-ik\cdot a} R_{ir}R_{js}\, g_{rs}(R^{-1}k) \qquad (87)$$

The photon state space $\mathcal{H}$ of transverse vector functions $A(k)$ may be embedded
isometrically in $\hat{\mathcal{H}}'$ by identifying a given $A \in \mathcal{H}$ with the tensor

$$g_{ij}(k) = \frac{1}{\sqrt{2}}\left(\frac{k_i}{\omega}A_j(k) + \frac{k_j}{\omega}A_i(k)\right) \qquad (88)$$

in $\hat{\mathcal{H}}'$. As easily checked, the transformation law (87) for tensors $g_{ij}$ of the
particular form (88) is equivalent to the vector transformation law (39) for $A$,
so that, by the embedding (88), $U(a, R)$ becomes a subrepresentation of
$\hat{U}'(a, R)$. The projection operator $\Phi'$ on $\hat{\mathcal{H}}'$ with range $\mathcal{H}$ transforms a given
*The subrepresentations with definite helicities are still highly reducible, since the absolute value of 
momentum P = k is also Euclidean invariant. 


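tensor $g_{ij} \in \hat{\mathcal{H}}'$ into a vector $A \in \mathcal{H}$ with components

$$A_i(k) = \sqrt{2}\left(\frac{k_j}{\omega}\,g_{ij}(k) - \frac{k_ik_jk_l}{\omega^3}\,g_{jl}(k)\right) \qquad (89)$$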
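
The properties claimed for the embedding can be checked directly. The sketch below assumes that the embedding (88) and the projection (89) take the forms written in its first line, with the unit vector $n = k/\omega$ introduced here only as shorthand, and uses the transversality condition $n \cdot A = 0$ together with the sum convention.

```latex
\begin{align*}
g_{ij} &= \tfrac{1}{\sqrt2}\,(n_i A_j + n_j A_i), \qquad
(\Phi' g)_i = \sqrt2\,\bigl(n_j g_{ij} - n_i n_j n_l\, g_{jl}\bigr), \\
g_{ii} &= \sqrt2\,(n\cdot A) = 0 \quad\text{(traceless)}, \\
\bar g_{ij}\, g_{ij} &= \tfrac12\bigl(|A|^2 + 2\,|n\cdot A|^2 + |A|^2\bigr) = |A|^2
   \quad\text{(isometric)}, \\
n_j g_{ij} &= \tfrac{1}{\sqrt2}\bigl(n_i\,(n\cdot A) + A_i\bigr) = \tfrac{1}{\sqrt2}\,A_i,
   \qquad n_j n_l\, g_{jl} = \sqrt2\,(n\cdot A) = 0, \\
(\Phi' g)_i &= \sqrt2\cdot\tfrac{1}{\sqrt2}\,A_i - 0 = A_i .
\end{align*}
```

With these forms the projection applied to an embedded photon state returns the original transverse vector function, as it must for a projection with range $\mathcal{H}$.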




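Since this extension $\hat{\mathcal{H}}'$ of $\mathcal{H}$ is of the form required in Theorem 2, the
self-adjoint operator $\hat{X}' = i\nabla_k$ with covariant spectral measure $E'(\Delta)$ exists on
$\hat{\mathcal{H}}'$, and leads to a covariant POV measure

$$F'(\Delta) = \Phi'E'(\Delta)|_{\mathcal{H}} \qquad (90)$$

and a position operator

$$X' = \Phi'\hat{X}'|_{\mathcal{H}} \qquad (91)$$

on the photon state space $\mathcal{H}$. A straightforward calculation with (88), (89) and
(91) leads to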
(91) leads to 

x '> A -(i^ A ) 

for an arbitrary $A \in \mathcal{H}$. This coincides with $X_jA$ as given by (43), and thus
$X' = X$. On the other hand, the POV measure $F'(\Delta)$ is different from $F(\Delta)$ as
constructed in Section 3 (see below). Since, therefore,

$$X = \int x\, dF(x) = \int x\, dF'(x)$$

we have here an example for the non-uniqueness of the POV measure
corresponding to a given operator $X$.

As an example for a covariant POV measure $F''(\Delta)$ for which the corresponding
position operator $X''$ is different from $X$, consider

$$F''(\Delta) = E_+F(\Delta)E_+ + E_-F(\Delta)E_- \qquad (92)$$

with $F(\Delta)$ as in Section 3 and the projection operators $E_\pm$ onto the subspaces
$\mathcal{H}_\pm$ of $\mathcal{H}$ belonging to the helicities $\pm 1$. Obviously (92) defines a POV measure,
whose covariance follows from $[E_\pm, U(a, R)] = 0$. The corresponding position
operator is

$$X'' = \int x\, dF''(x) = E_+XE_+ + E_-XE_- \qquad (93)$$

and is different from $X$ since $[X, E_\pm] \ne 0$ (as easily checked by direct calculation)
whereas, clearly, $[X'', E_\pm] = 0$.
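
Both statements about $X''$ follow from the projector algebra alone. The following sketch assumes only $X'' = E_+XE_+ + E_-XE_-$, $E_\pm^2 = E_\pm$, $E_+E_- = E_-E_+ = 0$ and $E_+ + E_- = 1$ on the photon state space, and ignores domain questions for the unbounded operator $X$.

```latex
\begin{align*}
E_\pm X'' &= E_\pm\,(E_+XE_+ + E_-XE_-) = E_\pm X E_\pm, \\
X'' E_\pm &= (E_+XE_+ + E_-XE_-)\,E_\pm = E_\pm X E_\pm, \\
[X'', E_\pm] &= 0, \\
X - X'' &= (E_+ + E_-)\,X\,(E_+ + E_-) - X'' = E_+XE_- + E_-XE_+ .
\end{align*}
```

The last line shows that $X'' = X$ would require the helicity-changing parts $E_+XE_-$ and $E_-XE_+$ of $X$ to vanish, which is equivalent to $[X, E_\pm] = 0$.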

Both examples may be easily generalized. Embedding of $U(a, R)$ into
$\hat{U}_S(a, R)$ with $S = 1, 2, 3, \ldots$, as described above for $S = 1$ and 2, leads to an
infinite sequence of covariant POV measures $F_S(\Delta)$, $S = 1, 2, 3, \ldots$, which are
all different. We can simply show this as follows. It is known (and follows easily


from Theorem 2) that, for any given $S$, all unitary operators $\hat{U}_S(a, R)$ together
with all spectral projections $E_S(\Delta)$ form an irreducible set of operators on the
representation space $\hat{\mathcal{H}}_S$. Statement (3) of Theorem 1 then implies that the
extensions $\hat{\mathcal{H}}_S$ of $\mathcal{H}$ are minimal and, consequently, that $F_S(\Delta) \ne F_{S'}(\Delta)$ if
$S \ne S'$.* Instead of (92) we may consider, more generally,

$$F''(\Delta) = \sum_{i,S} A^*_{iS}\,F_S(\Delta)\,A_{iS} \qquad (94)$$

with $F_S(\Delta)$ as above and (finitely or infinitely many) operators $A_{iS}$ satisfying

$$[A_{iS}, U(a, R)] = 0, \qquad \sum_{i,S} A^*_{iS}A_{iS} = 1 \qquad (95)$$

It may be proved from (95) that (94) indeed defines a covariant POV measure.†
Since the representation $U(a, R)$ is highly reducible, there are very many
different sets of operators $A_{iS}$ which satisfy (95) and which, in general, will also
lead to different POV measures $F''(\Delta)$. By Theorem 1, each $F''(\Delta)$ may also be
obtained from a suitable extension of $\mathcal{H}$ and $U(a, R)$. However, except for
particularly simple cases like (92), such an extension is expected to look rather
complicated.
We have seen that the covariant POV measure $F(\Delta) = F_1(\Delta)$ constructed in
Section 3 is very far from being unique. On the contrary, the diversity of 
possible candidates for a photon position observable might appear really 
bewildering. Moreover, any covariant POV measure leads to Heisenberg's 
position-momentum uncertainty relation (63) and to the photon velocity 
operator V given by (73), and is thus acceptable as a photon position observa- 
ble also from this point of view. This follows from Theorem 1 by a straightfor- 
ward generalization of the reasoning applied in Section 4. One could try to 
reduce this non-uniqueness, as done by Wightman (1962) for the particular 
case of spectral measures, by exploiting suitable additional postulates like time 
reversal invariance and 'smoothness' in momentum space. With such addi- 
tional assumptions Wightman was able to prove uniqueness of the Newton- 
Wigner position observables. However, one cannot hope to obtain uniqueness 
by this method in the case of general POV measures unless Wightman's 
additional postulates are sharpened considerably because, for example, all 
$F_S(\Delta)$ and many $F''(\Delta)$ of the form (94) are both time reversal invariant and
'smooth' in momentum space. 

Therefore we do not believe that the 'true' photon position observable, i.e. 
the one which describes real position measurements, can be determined by 
purely kinematic considerations. In this respect we fully agree with Wightman 
(1962), who wrote: 'All investigations of localizability for relativistic particles 
up to now . . . construct position observables consistent with a given transfor- 
mation law. It remains to construct complete dynamical theories . . . and then 

*Presumably, however, the corresponding position operators $X_S$ are all equal. (At least $X_1 = X_2$,
see above.)

†This is trivial if one is dealing with finitely many operators $A_{iS}$.


to investigate whether the position observables are indeed observable with the 
apparatus that the dynamical theories themselves predict.' At present the only 
candidate for a 'complete dynamical theory' of elementary particles is quantum 
field theory, and photon localization experiments should thus be investigated in 
the framework of quantum electrodynamics if one wants to go beyond pure 
kinematics. Such an investigation is also expected to allow a more profound 
treatment of the causality problem mentioned at the end of Section 4. Since 
quantum electrodynamics describes photons as 'quanta of a vector field', we 
are tempted to speculate that the 'vector' POV measure $F_1(\Delta)$ of Section 3
might be distinguished from this point of view. An additional argument in
favour of $F_1(\Delta)$ is simplicity. It is therefore not unreasonable to consider $F_1(\Delta)$,
in spite of its non-uniqueness, as describing actually realizable photon detectors.

The generalization of the present discussion to other elementary particles is 
almost obvious. For a massive particle, for instance, one finds that there exist 
infinitely many generalized position observables besides the usual Newton- 
Wigner position operator. The latter, however, is distinguished by the fact that 
it is the only 'ordinary' position observable.* For this reason, the
non-uniqueness problem appears not to be so serious in this case. Like the photon,
the neutrino also does not possess an 'ordinary' position observable (Wightman,
1962), whereas it is very easy to construct, via Theorems 1 and 2,
generalized position observables. Again one of them is distinguished by 
simplicity. It is an additional advantage of the present approach as compared to 
the one of Amrein, Jauch and Piron that the latter does not provide a 
theoretical description of neutrino position measurements (Amrein, 1969). 


I would like to thank Georg Reents and Michael Everitt for critical readings of 
the manuscript. 


Ali, S. T. and Emch, G. G. (1974) 'Fuzzy observables in quantum mechanics', J. Math. Phys., 15,
Amrein, W. O. (1969) 'Localizability for particles of mass zero', Helv. Phys. Acta, 42, 149-190.
Hegerfeldt, G. C. (1974) 'Remark on causality and particle localization', Phys. Rev., D10,
Heisenberg, W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Physik, 43, 172-198.
Jauch, J. M. and Piron, C. (1967) 'Generalized localizability', Helv. Phys. Acta, 40, 559-570.
Kraus, K. (1970) 'Note on azimuthal angle and angular momentum in quantum mechanics', Amer. J. Phys., 38, 1489-1490.

*Note that the 'decision effects' form a distinct class of 'effects' in Ludwig's theory (Ludwig, 1970). 
This implies a corresponding distinction of 'ordinary' (so-called 'decision') observables. 


Kraus, K. (1971) 'General state changes in quantum theory', Ann. Phys. (N.Y.), 64, 311-335.
Kraus, K. (1974) 'Operations and effects in the Hilbert space formulation of quantum theory', Lecture Notes in Physics (Springer-Verlag), 29, 206-229.
Ludwig, G. (1970) 'Deutung des Begriffs "physikalische Theorie" und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens', Lecture Notes

Neumann, H. (1971) 'Classical systems and observables in quantum mechanics', Commun. Math.

Neumann, H. (1972) 'Transformation properties of observables', Helv. Phys. Acta, 45, 811-819.
Neumark, M. A. (1943) 'On a representation of additive operator set functions', Doklady Acad.
Newton, T. D. and Wigner, E. P. (1949) 'Localized states for elementary systems', Revs. Mod.
Riesz, F. and Sz.-Nagy, B. (1956), Vorlesungen über Funktionalanalysis, Anhang. Berlin: VEB Deutscher Verlag der Wissenschaften.
Wightman, A. S. (1962) 'On the localizability of quantum mechanical systems', Revs. Mod. Phys.,
Wigner, E. (1939) 'Unitary representations of the inhomogeneous Lorentz group', Ann. Math., 40,

A New Theoretical and Experimental Outlook on 
Magnetic Monopoles 


Universita di Catania, Italy 



Universita dell' Aquila, Italy 

Since experiments looking for magnetic monopoles have failed until now, and
new experiments are going on, it should be interesting to know (and to take
into account) the predictions of special relativity alone on the subject.
We are going to show that special relativity alone:

(1). Does not explicitly predict the existence of (slower-than-light) magnetic
monopoles;

(2). Does explicitly predict, on the contrary, the existence of tachyonic (i.e.
faster-than-light) 'monopoles';

(3). Their unit magnetic charge appears predicted to be about one hundred
times less than that usually assumed (Dirac, 1931, 1948; Schwinger, 1966)
($g = \pm e$, in Gaussian units);

(4). Many good features of the old hypothesis about magnetic monopoles
(Dirac, 1931, 1948; Schwinger, 1966) are reproduced by simply taking account
of Superluminal ($v^2 > c^2$) speeds. In particular, the existence of both subluminal
($v^2 < c^2$) and Superluminal 'electric' charges leads to fully symmetrical
Maxwell equations (Mignani and Recami, 1974b), cf. equation (1) in the
following, and possibly to the Schwinger -type relation: eg = nah. 

In fact, let us build anew the theory of special relativity without assuming a
priori $|v| < c$ (Recami and Mignani, 1974a). In other words, let us start from the
following postulates:

(1). Principle of relativity: the laws of mechanics and of electromagnetism
are covariant under a transition between two inertial frames, whose relative
speed $u$ is a priori $-\infty < u < +\infty$.

(2). Space is isotropic and space-time homogeneous. Moreover, negative- 
energy particles do not exist and for every observer, physical signals are 
transported only by positive-energy objects. The usefulness of the last 



sentence— even in standard Relativity!— has been shown by us, e.g. in Recami 
and Mignani (1974a). 

There follows an 'extended relativity' (Recami and Mignani, 1974a and 
references therein), in which light speed is invariant with respect to all inertial 
frames, both subluminal (s) and Superluminal (S), and in which tachyons do not
imply (see, e.g., Recami, 1973; Pavsic, Recami and Ziino, 1976) any causality 
violation. What is more, the 'extended relativity' proved to be useful even for 
standard particle physics, since for example it allowed the derivation of the 
'crossing relations' for the relativistic reactions (Mignani and Recami, 1974e), 
and the CPT theorem (Mignani and Recami, 1974d). It leads, incidentally, to 
suitable redefinitions of the discrete symmetries. 

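The point we want to stress here is the following. If we consider the existence
of electric charges both subluminal [with four-current $j_\mu(s) = (\rho(s), \mathbf{j}(s))$] and
Superluminal [whose four-current is $j_\mu(S) = (\rho(S), \mathbf{j}(S))$], then the
(generalized) Maxwell equations read (Mignani and Recami, 1974b), for
$v^2 \gtrless c^2$:

$$
\begin{aligned}
\operatorname{div}\mathbf{D} &= +\rho(s) \\
\operatorname{div}\mathbf{B} &= -\rho(S) \\
\operatorname{rot}\mathbf{E} &= -\partial\mathbf{B}/\partial t + \mathbf{j}(S) \\
\operatorname{rot}\mathbf{H} &= +\partial\mathbf{D}/\partial t + \mathbf{j}(s)
\end{aligned}
\tag{1}
$$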




That is to say, faster-than-light electric charges are predicted by (extended)
relativity to behave in much the same way as magnetic monopoles were supposed to
do (apart from the different speed!); cf. also Figure 1 in Mignani and Recami
(1974b). In other words, a Superluminal, positive electric charge (e.g. with
speed $V > c$, along the $x$ axis) will bring into the field equations a contribution
similar to that which was supposed to come from a magnetic south pole (with
$v = c^2/V$, along $x$) (Recami and Mignani, 1974a). Thus, 'tachyonic electrons'
will appear with north magnetic charge ($+g$), and 'tachyonic protons' with
south magnetic charge ($-g$); and so on. Therefore, a Superluminal unit electric
charge $e$ will appear to us as a (tachyonic) 'monopole' with possibly the unit
magnetic charge:

$$ g = -e \quad \text{(in Gaussian units)} \tag{2} $$

so that in general we expect to have (when quantizing):

$$ eg = \pm\alpha\hbar \tag{3} $$

where $\alpha$ is the fine-structure constant. It follows that relativity seems to predict a
magnetic strength unit about 100 times smaller than that usually assumed.
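
The 'about 100 times' comparison can be made concrete with a minimal sketch. It assumes Gaussian units with $e^2 = \alpha\hbar c$ and the usual quantization conditions written with $c$ restored ($eg = \hbar c/2$ for Dirac, $eg = \hbar c$ for Schwinger). The function names and the numerical value of $\alpha$ are illustrative choices, not taken from the text.

```python
# Unit magnetic charges expressed in units of the elementary charge e
# (Gaussian units). Assumption: e**2 = alpha * hbar * c.
ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value)


def dirac_unit(alpha: float = ALPHA) -> float:
    """g/e for the smallest Dirac charge, from e*g = hbar*c/2."""
    return 1.0 / (2.0 * alpha)


def schwinger_unit(alpha: float = ALPHA) -> float:
    """g/e for the smallest Schwinger charge, from e*g = hbar*c."""
    return 1.0 / alpha


def tachyonic_unit() -> float:
    """|g|/e for the tachyonic 'monopole' of equation (2), where |g| = e."""
    return 1.0


if __name__ == "__main__":
    print(f"Dirac / tachyonic     = {dirac_unit() / tachyonic_unit():.1f}")
    print(f"Schwinger / tachyonic = {schwinger_unit() / tachyonic_unit():.1f}")
```

Under these assumptions the two ratios come out near 68.5 and 137.0, which bracket the factor of roughly one hundred quoted above.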

In other words, extended relativity predicts only one charge (let us call it
'electromagnetic charge'), which behaves, if you like, as 'electric' when
subluminal ($v^2 < c^2$) and as 'magnetic' when Superluminal ($v^2 > c^2$). Cf. again
Figure 1 in Mignani and Recami (1974b). Also, Maxwell's equations may be

Recami and Mignani 323 

written in a fully symmetrical form [cf. equation (1)], without assuming (sub- 
luminal) monopole existence. 

What is more, the universality of electromagnetic interactions is recovered in 
extended relativity, since $|g| = |e|$, i.e. only one coupling constant essentially
exists in our framework even before quantizing the theory. 

When passing to quantum mechanics, on the contrary, we can say the 
following. If one assumes the existence of subluminal magnetic monopoles,
then the simultaneous quantization of both electric and magnetic charges 
follows. This might suggest that even subluminal magnetic monopoles could 
exist, with their large unit charge. Notice, however, that the previous argument 
would be the only one in favour of subluminal magnetic charges, since, for 
example, in the present theory Maxwell equations already have a fully sym- 
metrical form (moreover, that argument would become even weaker if we 
actually succeed — when quantizing our theory — in deriving a relation like 
equation (3), which would also yield a charge quantization).

Here, let us mention only the following, in order to support our equation 
(2): (i) The Dirac relation $eg = n\hbar/2$ (or the analogous one by Schwinger)
enters the theory only when magnetic monopoles are supposed to be
subluminal; (ii) On the contrary, if magnetic monopoles are considered to be
Superluminal, then 'extended relativity' seems to yield the alternative relation 
g = ne. 

In fact, let us eventually quantize our theory by using Mandelstam's method, 
i.e. by following Cabibbo and Ferrari (1962). In that approach, the field 
quantities describing the charges (in interaction with the electromagnetic field) 
are defined so that: 

$$ \phi(x, P') = \phi(x, P)\,\exp\left[-\frac{ie}{2}\int_S F_{\mu\nu}\,d\sigma_{\mu\nu}\right] \tag{4} $$

where $S$ is a surface delimited by the two (space-like) paths $P$, $P'$ considered,
ending at point $x$. In other words, the field quantities $\phi$ are independent of the
gauge chosen for the fourpotential $A_\mu$ but are path-dependent. When only
(subluminal) electric charges are present, then $F_{\mu\nu} = A_{\nu/\mu} - A_{\mu/\nu}$ and equation
(4) does not depend on the selected surface $S$ (but depends merely on its
boundary $P - P'$). However, if subluminal magnetic monopoles are present
too, then $F_{\mu\nu} = A_{\nu/\mu} - A_{\mu/\nu} - i\varepsilon_{\mu\nu\rho\sigma}B_{\sigma/\rho}$ (where $B_\mu$ is a second fourpotential),
and the following condition must be explicitly imposed:



wherefrom Dirac's relation $eg = n\hbar/2$ follows. At this point, it is immediate to
realize that, if 'magnetic monopoles' cannot be put at rest, as in the case of
tachyon monopoles, then equation (4) is again automatically satisfied, without
any recourse to Dirac's condition.
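
The step from surface independence to Dirac's relation can be sketched as follows. This is a hedged reconstruction, not the authors' own derivation: it assumes a static monopole of magnetic charge $g$ enclosed by a closed surface $S$, for which $\tfrac{1}{2}\oint_S F_{\mu\nu}\,d\sigma_{\mu\nu}$ reduces to the magnetic flux $4\pi g$ (Gaussian units), and it restores $\hbar$ explicitly while keeping $c = 1$.

```latex
% Two surfaces with the same boundary P - P' differ by a closed surface S.
% Independence of the phase factor in (4) from the chosen surface requires
\exp\!\left[-\frac{ie}{\hbar}\cdot\frac{1}{2}\oint_S F_{\mu\nu}\,d\sigma_{\mu\nu}\right] = 1 .
% For a static monopole enclosed by S the flux is 4\pi g, hence
\frac{4\pi e g}{\hbar} = 2\pi n
\quad\Longrightarrow\quad
e g = \frac{n\hbar}{2}, \qquad n = 0, \pm 1, \pm 2, \ldots
```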


According to (extended) relativity, all the experimental searches for magnetic 
monopoles should be done, or redone, by actually looking for 'tachyon 
monopoles'; i.e. taking into account the newly proposed kinematics (faster- 
than-light speeds) and the possibly much lower value of the apparent magnetic 
strength. In particular, the 'tachyon monopoles' will probably experience in an
electromagnetic field the 'Lorentz force' $\mathbf{F} = g\mathbf{H} - g\mathbf{V}\wedge\mathbf{E}$, where, however,
$V^2 > c^2$. Actually, Bartlett and Lahana (1972) have tried already to look for
'tachyon monopoles', but in vain because the basis of their theoretical 
assumptions — Cherenkov radiation supposedly emitted by tachyons in vacuum 
(!) — is incorrect as has been shown by us (Mignani and Recami, 1974a). More 
details can be found in Recami and Mignani (1976) and in the proceedings (to 
appear) of the interdisciplinary seminars on 'Tachyons and Related Topics' 
delivered at Erice (September, 1976).


The authors are grateful to Dr. S. Chissick and to Dr. E. Papp for their kind 


Bartlett, D. F. and Lahana, M. D. (1972) Phys. Rev., D6, 1817.

Cabibbo, N. and Ferrari, E. (1962) Nuovo Cimento, 23, 1147.

Dirac, P. A. M. (1931) Proc. Roy. Soc., A133, 60.

Dirac, P. A. M. (1948) Phys. Rev., 74, 817.

Mignani, R. and Recami, E. (1974a) Lett. Nuovo Cimento, 9, 362.

Mignani, R. and Recami, E. (1974b) Lett. Nuovo Cimento, 9, 367.

Mignani, R. and Recami, E. (1974c) Lett. Nuovo Cimento, 11, 417.

Mignani, R. and Recami, E. (1974d) Lett. Nuovo Cimento, 11, 421.

Mignani, R. and Recami, E. (1974e) Nuovo Cimento, A24, 438.

Recami, E. and Mignani, R. (1976) Physics Letters, 62B, 41.

Recami, E. (1973) Annuario 73, Enciclopedia EST-Mondadori (Milano), p. 85.

Recami, E. and Mignani, R. (1974a) Rivista Nuovo Cimento, 4, 209-290; [Erratum] 4, 398.

Recami, E. and Mignani, R. (1974b) Lett. Nuovo Cimento, 9, 479.

Pavsic, M., Recami, E. and Ziino, G. (1976) Lett. Nuovo Cimento, in press.

Schwinger, J. (1966) Phys. Rev., 144, 1087.


Problems in Conformally Covariant Quantum Field
Theory

W. RÜHL and B. C. YUNN

Universität Kaiserslautern, Germany


The conformal group appeared in physics as early as 1909 when Cunningham 
(1909) and Bateman (1910) first noticed that Maxwell's equations are not only 
Lorentz covariant but also covariant under the larger conformal group. This 
consists of the usual Lorentz transformations, translations, dilations and 
special conformal transformations. Since then many attempts have been made 
to somehow utilize this group in physics (Kastrup, 1962, 1964; Wess, 1960; 
Fulton and coworkers, 1962; Mack and Salam, 1969). 

We are particularly interested in the possibility of constructing a local 
quantum field theory which is also conformally covariant. One particular 
feature of such a field theory is that it possesses global operator product 
expansions of the Wilson type (Wilson, 1969). This may have far-reaching 
consequences in more realistic field theories involving non-zero masses as well. 

The requirement of the conformal symmetry is so strong that the most 
general two- and three-point functions are determined completely up to 
arbitrary normalization constants. Therefore, for example, their analytic
structures can be studied unambiguously. So far there are two non-perturbative
approaches to analysing a conformally covariant quantum field theory. One is
the so-called bootstrap approach which tries to construct general n -point 
functions from the skeleton graph expansion using the conformally covariant 
two- and three-point functions. This was initiated by Migdal (1971), Mack and 
Todorov (1973) and Polyakov (1969). In this approach one can indeed prove 
that every term in the expansions is ultraviolet convergent if one restricts the 
anomalous dimensions of the fields to a certain range. Thus the construction is 
term by term conformally covariant. The dimensions and coupling constants, 
however, are not free parameters; instead they are determined from
self-consistency conditions which arise from integral equations for the two- and
three-point functions. The main drawback of this bootstrap approach lies in the 
inability to handle the infinite series appearing in the expansions just as in 
conventional perturbation theory. 



Another approach adopted by Mack (1974) to avoid this difficulty starts by 
writing down an infinite number of coupled integral equations for Euclidean 
Green's functions and solving them by making use of conformal partial wave 
expansions. A remarkable observation is that in this way one can diagonalize 
the whole set of integral equations and thereby reduce them to a set of algebraic 
equations for the partial wave amplitudes. A careful analysis shows that these 
waves must possess some poles, and the factorization property of their residues 
also follows when one considers them as analytic functions of the representa- 
tion parameters. This in turn enables one to derive asymptotic operator 
product expansions with a certain additional assumption. This program, inten- 
sively pursued in recent years by Dobrev and coworkers (1975a, 1975b), has 
also some intrinsic difficulties. In particular imposing crossing symmetry on the 
partial wave amplitudes is difficult to carry out. In any case it remains to be seen 
whether the latter difficulty is easier to handle than the infinite summation 
problem in the bootstrap approach. 

Formulating the conformally covariant quantum field theory directly in 
Minkowski space provides additional difficulties connected with the fact that 
the conformal group is larger than the group of causal automorphisms of 
Minkowski space. The difficulties already manifest themselves in free field 
theories. Until recently it was considered that one was either forced to move on 
Euclidean space or to restrict oneself only to infinitesimal transformations thus 
inventing a terminology like 'weak conformal invariance' (Hortacsu and coworkers,
1972). Detailed studies (Schroer and Swieca, 1974; Kupsch and coworkers,
1975) of the free fields and of some explicitly soluble interacting field
theories in two-dimensional space-time have now made it possible to understand
the structure involved. It is generally accepted that the use of
the universal covering group of the conformal group is essential. The fields are
subjected to a non-local Fourier decomposition on the centre of this group 
which generalizes the concept of decomposition in a creation and an annihila- 
tion part of the free field. Operator product expansions in Minkowski space 
have to be studied not in terms of fields but in terms of these non-local 
projections. This makes the whole scheme very complicated. An interesting 
question is whether one can somehow recombine all these components into a 
local expression, and also the question of the convergence of the operator 
product expansion is well worth pursuing further. 

Our plan is as follows. In Section 2 some difficulties associated with the fact 
that the conformal group does not preserve the causal structure of the 
Minkowski space $M_D$ ($D$ = dimension) are discussed. The universal covering of
a compactified Minkowski space (denoted $M_D^{uc}$), with its causal structure that is
locally isomorphic to that of $M_D$ and is invariant under the universal covering
group of the conformal group, is introduced and the possibility of defining a field
theory on $M_D^{uc}$ is also discussed. The transformation properties of quantized
fields are investigated in Section 3. A significant role is played by the generating 
element Z of the centre in formulating the non-local decomposition of the field 
operators. The Thirring model is introduced as an explicit example of a 


conformally covariant field theory. In the subsequent Section 4 this model is 
used in our study of operator product expansions in Minkowski space. In 
two-dimensional space-time the conformal group and its covering are small 
enough to carry out necessary computations explicitly and this makes it easy to 
show the local structure of operators appearing in the expansions. In Section 5 
we try to develop some general model independent ideas on the operator 
product expansions. 


Causality as a geometric concept is a partial ordering in Minkowski space or,
more generally, in a manifold. We are interested mainly in the Minkowski space
$M_4$, but for the sake of constructing models other Minkowski spaces $M_D$ with
the characteristic form 


$$ x^2 = (x^0)^2 - \mathbf{x}^2 \qquad (\mathbf{x} \text{ has } D-1 \text{ components}) \tag{1} $$


are also of importance. Automorphisms of the manifold, in particular of $M_D$,
that together with their inverses preserve the causal ordering are called causal
automorphisms. Zeeman's theorem (Zeeman, 1964) asserts that the group of
causal automorphisms of $M_4$ (relatively time-like pairs of vectors are ordered
into an 'earlier' and a 'later' vector, relatively space-like pairs are not ordered)
consists of orthochronous Lorentz transformations, translations and dilations.
This group we call the 'Weyl group'. The conformal group possesses the Weyl
group as a proper subgroup and thus violates the causal ordering. Zeeman's
theorem can easily be generalized to Minkowski spaces with $D > 2$. It has for a
long time been interpreted as forbidding any extension of the space-time 
symmetry beyond the Weyl group, in particular excluding any internal sym- 
metry combined with space-time symmetry. 

The conformal group consists of products of inhomogeneous Lorentz trans- 
formations, of dilations 

$$ x_g^\mu = \lambda x^\mu, \quad \lambda > 0 \tag{2} $$

and of special conformal transformations

$$ x_g^\mu = \sigma(b, x)^{-1}\,(x^\mu + b^\mu x^2) \tag{3} $$

$$ \sigma(b, x) = 1 + 2b_\mu x^\mu + b^2 x^2 \tag{4} $$

It is obvious that $M_D$ is not a homogeneous space for these transformations,
since whatever non-zero $b$ we choose in (3), there are vectors $x \in M_D$ for which
$\sigma(b, x) = 0$. Compactifying $M_D$ evades this problem but leads to a manifold
that trivially does not possess a causal ordering which extends the causal
ordering of $M_D$, since the time axis is now closed at infinity.
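
The way a special conformal transformation can turn a space-like vector into a time-like one may be illustrated numerically. The following is a minimal sketch for $D = 4$ with metric signature $(+,-,-,-)$; the helper names (`mdot`, `sigma`, `special_conformal`, `inversion`) and the sample vectors are hypothetical choices made for this illustration. It uses two standard identities that follow directly from (3) and (4): $x_g^2 = x^2/\sigma(b, x)$, and the decomposition of (3) into inversion, translation by $b$, and inversion.

```python
# Special conformal transformation (3)-(4) in D = 4, signature (+,-,-,-).

def mdot(a, b):
    """Minkowski product a.b with signature (+,-,-,-)."""
    return a[0] * b[0] - sum(ai * bi for ai, bi in zip(a[1:], b[1:]))


def sigma(b, x):
    """sigma(b, x) = 1 + 2 b.x + b^2 x^2, equation (4)."""
    return 1.0 + 2.0 * mdot(b, x) + mdot(b, b) * mdot(x, x)


def special_conformal(b, x):
    """x_g = (x + b x^2) / sigma(b, x), equation (3)."""
    s = sigma(b, x)
    if s == 0.0:
        raise ZeroDivisionError("sigma(b, x) = 0: x is mapped to infinity")
    x2 = mdot(x, x)
    return [(xi + bi * x2) / s for xi, bi in zip(x, b)]


def inversion(x):
    """x -> x / x^2 (undefined on the light-cone)."""
    x2 = mdot(x, x)
    return [xi / x2 for xi in x]


if __name__ == "__main__":
    b = [0.0, 0.5, 0.0, 0.0]
    x = [1.0, 2.0, 0.0, 0.0]          # space-like: x^2 = -3
    xg = special_conformal(b, x)       # sigma(b, x) = -0.25 < 0
    print("x^2   =", mdot(x, x))
    print("sigma =", sigma(b, x))
    print("x_g   =", xg, " x_g^2 =", mdot(xg, xg))
```

Since the origin is left fixed by (3), the pair $(0, x)$, which is relatively space-like, is mapped to the pair $(0, x_g)$, which is relatively time-like whenever $\sigma(b, x) < 0$.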

A useful parametrization of $M_D$ and $\bar{M}_D$, the compactification of $M_D$, is
defined as follows (Rühl, 1975). Introduce the Hermitian $2\times 2$ matrix

$$ X = x^0\sigma_0 + \mathbf{x}\cdot\boldsymbol{\sigma}, \quad \sigma_0 = 1 \tag{5} $$

and set

$$ U = (\sigma_0 + iX)(\sigma_0 - iX)^{-1} \tag{6} $$

$$ U = e^{i\varphi/2}u, \quad \det u = 1, \quad 0 \le \varphi < 2\pi \tag{7} $$

(for $M_3$ ($M_2$) define $x^1$ ($x^1$ and $x^2$) to be zero). Then the compactification $\bar{M}_D$ is
obtained from $M_D$ by adjoining all $U$ with $\det(\sigma_0 + U) = 0$. As can be seen from
(7) and the parametrization

$$ u = u^0\sigma_0 + i\mathbf{u}\cdot\boldsymbol{\sigma}, \quad (u^0)^2 + \mathbf{u}\cdot\mathbf{u} = 1 \tag{8} $$

$\bar{M}_D$ has the topological structure

$$ S_1 \times S_{D-1} \tag{9} $$

This is also true for $D > 4$.
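
The parametrization can be checked numerically. The sketch below assumes the Cayley-transform reading $U = (\sigma_0 + iX)(\sigma_0 - iX)^{-1}$ of (6) and the split $U = e^{i\varphi/2}u$ of (7); all function names are hypothetical, and the $2\times 2$ matrix helpers are written out by hand so that only the standard library is needed.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def dagger(a):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def cayley(x0, x1, x2, x3):
    """U = (1 + iX)(1 - iX)^(-1) with X = x0*1 + x.sigma (Pauli matrices)."""
    X = [[x0 + x3, x1 - 1j * x2], [x1 + 1j * x2, x0 - x3]]
    plus = [[(1 if i == j else 0) + 1j * X[i][j] for j in range(2)]
            for i in range(2)]
    minus = [[(1 if i == j else 0) - 1j * X[i][j] for j in range(2)]
             for i in range(2)]
    return matmul(plus, inverse(minus))


def split_phase(U):
    """Return (phi, u) with U = exp(i*phi/2) * u, det u = 1, 0 <= phi < 2*pi."""
    phi = cmath.phase(det(U)) % (2.0 * math.pi)
    factor = cmath.exp(-0.5j * phi)
    return phi, [[factor * U[i][j] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    U = cayley(0.3, -0.7, 1.2, 0.5)
    phi, u = split_phase(U)
    print("phi =", phi, " det u =", det(u))
```

For $x^1 = x^2 = 0$ the matrix $U$ is diagonal with entries $(1 + ix^\pm)/(1 - ix^\pm)$, where $x^\pm = x^0 \pm x^3$, so each light-cone coordinate is mapped onto its own circle.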

Despite these difficulties with causality, physical models are known that are
quantum field theories in the proper sense and exhibit conformal covariance
with an invariant vacuum, namely the free massless operator fields. Beyond
these free theories a few models with interactions in $M_2$ are known to be
conformally symmetric.

In the framework of quantum field theory, free field theories are limiting
cases, as comes out in the following fashion (Rühl, 1973). We assume that a
conformally covariant quantum field theory in the sense of Wightman is given
and that the vacuum is invariant. We consider the state

$$ \Phi_i(x)\,|0\rangle \tag{10} $$

where $\Phi_i(x)$ is any spinor field operator. Due to the spectrum condition this
state can be analytically continued in x into the tube domain which is a 
homogeneous space for the conformal group and its universal covering group. 
If the conformal group acts on it, it transforms as an analytic representation, i.e. 
a representation of the discrete series (Rühl, 1973; Mack, 1975). Such representations
are labelled by three parameters: $j_1$, $j_2$, $d$, where $j_1$ ($j_2$) is the
undotted (dotted) spin as for a spinor representation of $SL(2, C)$, and $d$ is the
'dimension' of the field. Since one wants $d$ to assume not only integral or
half-integral values, one has to study the universal covering group of the
conformal group. This group is denoted $G_D$ in the following.
The invariant two-point function

$$ \langle 0|\,\Phi_i(x)\,\Phi_j(y)^{\dagger}\,|0\rangle = |\gamma|^2\,S_{ij}(x - y) \tag{11} $$

is fixed by group theory alone, up to a positive normalization constant, to be an
intertwining operator for the discrete series representations of $G_D$. It is a
homogeneous distribution in $x - y$ of degree $-2d$. The requirement that this
distribution be positive is equivalent to the requirement that the discrete
series representation involved admits an invariant norm in a Hilbert space, i.e.
it is unitary. This entails that the dimension $d$ is bounded from below, namely

$$ d \ge j_1 + j_2 + 2 \quad \text{if } j_1 j_2 \neq 0 \tag{12a} $$

$$ d \ge j_1 + j_2 + 1 \quad \text{if } j_1 j_2 = 0 \tag{12b} $$

At the lower bound (12b) of d degenerate representations appear that, as we 
shall see, belong to free fields. 
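
As a small illustration of the bounds (12a) and (12b), the following sketch computes the lowest admissible dimension for given spins in $D = 4$, reading both bounds as non-strict inequalities. The function names are hypothetical, and exact rational arithmetic is used so that half-integral spins are handled without rounding.

```python
from fractions import Fraction


def unitarity_bound(j1, j2):
    """Lower bound on d from (12a)/(12b) for spins (j1, j2) in D = 4."""
    j1, j2 = Fraction(j1), Fraction(j2)
    for j in (j1, j2):
        if j < 0 or (2 * j).denominator != 1:
            raise ValueError("spins must be non-negative integers or half-integers")
    return j1 + j2 + (2 if j1 * j2 != 0 else 1)


def is_allowed(d, j1, j2):
    """True if the dimension d respects the unitarity bound."""
    return Fraction(d) >= unitarity_bound(j1, j2)


def saturates_free_field_bound(d, j1, j2):
    """True if d sits exactly at the lower bound (12b), the free-field case."""
    j1, j2 = Fraction(j1), Fraction(j2)
    return j1 * j2 == 0 and Fraction(d) == unitarity_bound(j1, j2)


if __name__ == "__main__":
    half = Fraction(1, 2)
    for spins in [(0, 0), (half, 0), (half, half), (1, 0)]:
        print(spins, "->", unitarity_bound(*spins))
```

For example, a scalar field ($j_1 = j_2 = 0$) must have $d \ge 1$ and a field with $j_1 = \tfrac12$, $j_2 = 0$ must have $d \ge \tfrac32$, the canonical dimensions of the free massless scalar and spinor fields.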

In fact, from (11) we deduce the vacuum expectation value of the causal 
commutator (or anticommutator). The usual connection between spin and 
statistics can be verified. If d assumes the lower bound value (12b), the 
commutator (anticommutator) function assumes the canonical form for a free 
massless field. Due to the theorem of Jost, Schroer and Pohlmeyer (Jost, 1961; 
Pohlmeyer, 1969), the field itself is a free massless field in this case. 

Another conclusion can be drawn from (12b) and the homogeneity of the 
two-point function. A conformally covariant quantum field involves asympto- 
tic states carrying particles if and only if it is a free massless field. This reduces 
the value of such field theories considerably. We can either regard them as 
models of academic interest only, some of whose properties can hopefully be 
carried over to more general quantum field theories with particles, or they 
appear at best as limiting theories in the Gell-Mann-Low sense of realistic 
quantum field theories (Gell-Mann and Low, 1954). 

For free fields causality does not cause any problem. Their commutator 
(anticommutator) is a number-valued distribution and this is conformally 
covariant. All n -point functions can be expressed by these two-point functions 
and are automatically covariant. For a deeper inspection we make the ansatz

$$ U_g\,\Phi(x)\,U_g^{-1} = \mu(g, x)\,\Phi(x_g) \tag{13} $$

(spinor indices suppressed), i.e. a 'local transformation' under the conformal group. It involves a singular
multiplier $\mu(g, x)$ as soon as special conformal transformations (3), (4) participate
in the group element $g$. The multiplier involves a negative power of $\sigma(b, x)$ (for the
exact expression see Section 3) and thus is singular whenever $x_g$ (3) is singular.
In the case that we project both sides of (13) on the vacuum from the right, the 
singular multiplier is a boundary value of an antiholomorphic function in the 
forward tube domain, and if we project it on the vacuum from the left, we 
obtain a boundary value of a holomorphic function on the forward tube 
domain. It follows that the singular multiplier in (13) cannot be given a unique 
meaning, since the multipliers are different in either case. For free fields the 
projections on the vacuum can equivalently be performed by decomposing the 
field into its positive and negative frequency parts and writing an equation of
the type (13) for each part separately. As multipliers we use the appropriate
boundary values. As we shall show in the subsequent section, in the general
case the field operator has to be harmonically analysed on the centre of $G_D$
instead, and each Fourier component transforms as (13) with its specific 
multiplier composed of both boundary values in general. 


Though it is not necessary, it is quite useful, both for technical and illustrative
purposes, to formulate a conformally covariant quantum field theory by
maintaining the form of the transformation law (13) without Fourier decomposition
of the field and with a unique multiplier, by introducing fields on
the universal covering space $M_D^{uc}$. This amounts essentially to letting $\varphi$ in (7)
assume all real values from $-\infty$ to $+\infty$. Instead of (9) we get the structure

$$ R_1 \times S_{D-1} \tag{14} $$

In the case $D = 2$ we have to take the universal covering with respect to both
factors $S_1$ and thus obtain

$$ R_1 \times R_1 \tag{15} $$

In the latter case $U$ (7) is a diagonal $2\times 2$ matrix

$$ U = \begin{pmatrix} e^{i\varphi_+} & 0 \\ 0 & e^{i\varphi_-} \end{pmatrix} \tag{16} $$

$$ x^{\pm} = x^0 \pm x^3 \tag{17} $$

In this case we let both $\varphi_\pm$ assume all real values.

The manifold $M_D^{uc}$ has a remarkable property first discovered by Segal (Segal,
1971; Mayer, 1974). It possesses a conformally invariant (under $G_D$) causal
ordering in the sense described above. For $D > 2$, $M_D^{uc}$ possesses an infinite
number of sheets labelled $n = 0, \pm 1, \pm 2, \ldots$. A space-like vector of the zeroth
sheet can be mapped by continuous variation of the group element onto other
space-like vectors on the same sheet, onto a point at infinity, and further onto
points of the $\pm$-first sheet that lie over time-like vectors of the zeroth sheet. If
we identify these points on the $\pm$-first sheet with the points of the zeroth sheet,
we have transformed space-like into time-like vectors and thus violated the
causal ordering of $M_D$ (the second vector of the pair can be taken to be the
null-vector). However, on $M_D^{uc}$ we may call these points on the $\pm$-first sheet,
obtained from space-like points on the zeroth sheet by means of special
conformal transformations, also space-like. It is then easy to see that the
remainder of the manifold $M_D^{uc}$ can be cast into a future and a past submanifold
plus a light-cone. Thus we have succeeded in extending the causal structure
from $M_D$ onto $M_D^{uc}$.

In this way it is possible to define a quantum field $\Phi(\hat{x})$ on $M_D^{uc}$, with $\hat{x}$ over $x$ on
the $n$th sheet, such that locality can be formulated by

$$ [\Phi(\hat{x}), \Phi(\hat{y})] = 0 \tag{18} $$

whenever $\hat{x}$, $\hat{y}$ are relatively space-like, say for a scalar field. This locality
condition (18) can be postulated to be invariant under $G_D$.
Any Wightman $m$-point function

$$ \langle 0|\,\Phi_1(\hat{x}_1)\,\Phi_2(\hat{x}_2)\cdots\Phi_m(\hat{x}_m)\,|0\rangle \tag{19} $$


is independent of the sheet number n if all fields are defined on the same sheet. 
Field theories on fixed sheets are isomorphic. In fact there is a unitary operator 
Z so that 

$$ \Phi(\hat{x}_1) = Z^{\,n_1 - n_2}\,\Phi(\hat{x}_2)\,Z^{-(n_1 - n_2)} \tag{20} $$

if $\hat{x}_1$ ($\hat{x}_2$) lies on the $n_1$-st ($n_2$-nd) sheet over $x$. Any conformal transformation
that does not lead any of the arguments of the m -point function (19) out of its 
sheet, leaves (20) unaltered. 

From this one can deduce that Z commutes with all conformal transforma- 
tions and thus represents an element of the centre of G D . In fact, it represents 
the generating element of the centre. Of course, any local observable should be 
identical on all sheets and thus commute with $Z$.

Finally all Wightman m -point functions with arguments on arbitrary sheets 
can be obtained from the same function with all arguments on the zeroth sheet 
by analytic continuation. This and the previous assertions can be proved 
(Lüscher and Mack, 1975) by first requiring conformal covariance only under
infinitesimal transformations, then continuing the Wightman (or time-ordered) 
functions into the Euclidean domain, where the generators of the conformal 
group can be implemented easily to the Euclidean conformal transformations, 
finally continuing back to the Minkowskian boundary, which then turns out to 
have the sheet structure just described. 

In the case $D = 2$ a few alterations are necessary. There is a doubly infinite
sequence of sheets $n_+ = 0, \pm 1, \pm 2, \ldots$ and $n_- = 0, \pm 1, \pm 2, \ldots$ and corresponding
operators $Z_+$ and $Z_-$ as the intersheet isomorphisms. In this case $G_D$ is the
direct product of two groups

$$ G_2 = SU(1, 1)^{uc} \times SU(1, 1)^{uc} \tag{21} $$

The first (second) group acts on $\varphi_+$ ($\varphi_-$), according to

$$ e^{i\varphi_g} = \frac{\alpha\,e^{i\varphi} + \beta}{\bar{\beta}\,e^{i\varphi} + \bar{\alpha}} \tag{22} $$

$$ \varphi_g = \varphi + 2\arg(\alpha + \beta e^{-i\varphi}) \tag{23} $$

where $\arg\alpha$ is allowed to range over

$$ -\infty < \arg\alpha < +\infty \tag{24} $$

in order to obtain the universal covering group of $SU(1, 1)$.
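
The action on the covering variable can be sketched numerically. The example assumes the standard parametrization of $SU(1, 1)$ by a pair $(\alpha, \beta)$ with $|\alpha|^2 - |\beta|^2 = 1$ acting on the circle by a Möbius map, and it treats $\arg\alpha$ as an unrestricted real number in order to label elements of the covering group. The function names are hypothetical. Because $|\beta/\alpha| < 1$, the argument of $1 + (\beta/\alpha)e^{-i\varphi}$ lies strictly between $-\pi/2$ and $\pi/2$, so $\arg(\alpha + \beta e^{-i\varphi}) = \arg\alpha + \arg(1 + (\beta/\alpha)e^{-i\varphi})$ is defined without ambiguity.

```python
import cmath
import math


def moebius(alpha, beta, phi):
    """Image of exp(i*phi) under the SU(1,1) element (alpha, beta)."""
    alpha, beta = complex(alpha), complex(beta)
    e = cmath.exp(1j * phi)
    return (alpha * e + beta) / (beta.conjugate() * e + alpha.conjugate())


def phi_g(arg_alpha, beta, phi):
    """phi_g = phi + 2*arg(alpha + beta*exp(-i*phi)) on the covering line.

    arg_alpha may be any real number; |alpha| is fixed by
    |alpha|^2 - |beta|^2 = 1.
    """
    beta = complex(beta)
    alpha = math.sqrt(1.0 + abs(beta) ** 2) * cmath.exp(1j * arg_alpha)
    # Principal value is safe here because |beta/alpha| < 1.
    inner = cmath.phase(1.0 + (beta / alpha) * cmath.exp(-1j * phi))
    return phi + 2.0 * (arg_alpha + inner)


if __name__ == "__main__":
    a, beta, phi = 0.3, 0.4 + 0.2j, 1.1
    print("phi_g                  =", phi_g(a, beta, phi))
    print("after central element  =", phi_g(a + math.pi, -beta, phi))
```

Replacing $(\alpha, \beta)$ by $(-\alpha, -\beta)$ with $\arg\alpha \to \arg\alpha + \pi$ leaves the Möbius map on the circle unchanged but shifts $\varphi_g$ by $2\pi$, i.e. it moves the point to the next sheet.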


Physically relevant unitary irreducible representations of $G_4$, the universal
covering group of the conformal group, can be constructed in the usual fashion
by inducing them from an appropriate subgroup. Requiring that these representations
be realizable on spaces of functions of vectors $x$ in Minkowski
space, i.e. by classical fields, we are led to consider the stability subgroup of
these vectors. For $x = 0$ this subgroup consists of homogeneous Lorentz
transformations, of dilations, and of special conformal transformations. The
representations obtained in this fashion have been classified as follows (Mack
and Salam, 1969).

Let $\kappa_\mu$ be the generators of the special conformal transformations (3)
represented by a matrix acting on the classical field at $x = 0$. Then $\kappa_\mu$ may be
identically zero; this type of field representation is called Ia in Mack and Salam
(1969). Secondly, $\kappa_\mu$ may be non-zero but a finite dimensional matrix. Then it
has to be nilpotent due to the abelian structure of the subgroup of special
conformal transformations. Such representations are denoted Ib. Finally there
are infinite dimensional matrices $\kappa_\mu$; these are denoted type II. Representations
of type Ia are the representations almost exclusively encountered in field
theory. If we require in addition that the energy-momentum spectrum be
restricted to the forward light-cone, we obtain the discrete series representations
mentioned in Section 2 that were used there for the one-particle states
(Rühl, 1973; Mack, 1975). We mention finally that the group $G_2$ and its
representations that are used for the Thirring model in $M_2$ have been studied by
many authors. We shall only explain a few notations in this article but otherwise
refer to an exhaustive presentation in the literature (Rühl and Yunn, 1975a).

The transformation property of a conformally covariant spinor field under 
special conformal transformations of G 4 is by representation theory 

$$ U_g\,\Phi_{AB}(x)\,U_g^{-1} = \sigma(b, x)^{-d-j_1-j_2} \sum_{A'B'} D^{j_1}(\sigma_0 + XB)_{AA'}\,\Phi_{A'B'}(x_g)\,D^{j_2}(\sigma_0 + BX)_{B'B} \tag{25} $$

$$ B = b^0\sigma_0 - \mathbf{b}\cdot\boldsymbol{\sigma} \tag{26} $$

for group elements sufficiently close to the unit element. $D^j$ denote representation
matrices of covariant spinor representations of $SL(2, C)$. As pointed out in
the preceding section, the difficulty consists in interpreting the singular factor
$\sigma^{-d-j_1-j_2}$, which for fixed $x$ and a sufficiently small neighbourhood of the group
unit of $G_4$ is regular and well defined by $\arg\sigma(b, x) = 0$. The solution to the
general problem has been found by studying the Schroer model (Schroer and
Swieca, 1974) and the Thirring model (Kupsch, Rühl and Yunn, 1975) in $M_2$.
We shall describe it now in general terms. 

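We assume that we have a Wightman-type field theory that is conformally
covariant, i.e. there exists a unitary representation $U_g$ of $G_D$ with

$$ U_g\,|0\rangle = |0\rangle \tag{27} $$

Let the generating element of the centre of $G_D$ ($D > 2$) be represented by $Z$. We
introduce the Fourier component $\Phi_\tau$, $0 \le \tau < 1$, on the centre of $G_D$

$$ \Phi_\tau(x) = \sum_{n=-\infty}^{+\infty} Z^n\,\Phi(x)\,Z^{-n}\,e^{-2\pi i n\tau} \tag{28} $$

so that

$$ Z\,\Phi_\tau(x)\,Z^{-1} = e^{2\pi i\tau}\,\Phi_\tau(x) \tag{29} $$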
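
The covariance of the Fourier component under $Z$ follows from a shift of the summation index. The short derivation below is a sketch that assumes the Fourier component is defined as $\Phi_\tau(x) = \sum_n Z^n\,\Phi(x)\,Z^{-n}\,e^{-2\pi i n\tau}$; the sign of the exponent is an assumption, chosen so that the phase $e^{2\pi i\tau}$ comes out as stated.

```latex
Z\,\Phi_\tau(x)\,Z^{-1}
  = \sum_{n=-\infty}^{+\infty} Z^{\,n+1}\,\Phi(x)\,Z^{-(n+1)}\,e^{-2\pi i n\tau}
  = e^{2\pi i\tau}\sum_{m=-\infty}^{+\infty} Z^{\,m}\,\Phi(x)\,Z^{-m}\,e^{-2\pi i m\tau}
  = e^{2\pi i\tau}\,\Phi_\tau(x),
  \qquad m = n + 1 .
```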



We continue the equation (25) away from the group unit so that $\hat{x}$ on the
zeroth sheet over $x \in M_D$ moves over to the first sheet (minus first sheet). We
consider the two boundary values

$$ \sigma_\pm(b, x) = \lim_{y \to 0} \sigma(b, x \pm iy) \tag{30} $$

where $y$ tends to zero in the forward light-cone. We obtain then on the first
(minus first) sheet

$$ \arg\sigma_\pm(b, x) = \mp\pi\ (\pm\pi) \tag{31} $$

In accordance with (29) we make therefore the ansatz

$$ U_g\,\Phi^\tau_{AB}(x)\,U_g^{-1} = \sigma_+(b, x)^{-\frac{d}{2}-j_1+\tau}\,\sigma_-(b, x)^{-\frac{d}{2}-j_2-\tau} \sum_{A'B'} D^{j_1}(\sigma_0 + XB)_{AA'}\,\Phi^\tau_{A'B'}(x_g)\,D^{j_2}(\sigma_0 + BX)_{B'B} \tag{32} $$

We can introduce field operators on $M_D^{uc}$ by setting on the zeroth sheet

$$ \Phi_{AB}(x) = 2^{2(d-1)}\,|\det(\sigma_0 - iX)|^{-d-j_1-j_2} \sum_{A'B'} D^{j_1}(\sigma_0 - iX)_{AA'}\,f_{A'B'}(U(X))\,D^{j_2}(\sigma_0 - iX)_{B'B} \tag{33} $$

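and requiring a 'local' transformation law

$$ U_g\,f_{AB}(U)\,U_g^{-1} = |\det(A^\dagger + UB^\dagger)|^{-\frac{d}{2}-j_1}\,|\det(CU + D)|^{-\frac{d}{2}-j_2} \sum_{A'B'} D^{j_1}(A^\dagger + UB^\dagger)_{AA'}\,f_{A'B'}(U_g)\,D^{j_2}(CU + D)_{B'B} \tag{34} $$

Here $A$, $B$, $C$, $D$ denote $2\times 2$ matrices making up a $4\times 4$ matrix

$$ m = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \tag{35} $$

$$ m^\dagger H = H m^{-1}, \quad H = \begin{pmatrix} -\sigma_0 & 0 \\ 0 & +\sigma_0 \end{pmatrix} \tag{36} $$

that belongs to $SU(2, 2)$. The group $SU(2, 2)^{uc}$ is isomorphic to $G_4$. The matrix
$m$ with

$$ B = C = 0, \quad A = D = i\sigma_0 \tag{37} $$

generates the centre of $SU(2, 2)^{uc}$. We find from (34)

$$ Z\,f_{AB}(u, \varphi)\,Z^{-1} = e^{i\pi(j_2 - j_1)}\,f_{AB}(-u, \varphi - 2\pi) \tag{38} $$

by means of

$$ U_g = (AU + B)(CU + D)^{-1} \tag{39} $$

$$ \varphi_g = \varphi - \arg\det(A^\dagger + UB^\dagger) - \arg\det(CU + D) \tag{40} $$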

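Of course the field operator $f_{AB}(U)$ can also be decomposed on the centre of
$G_4$ by a formula such as (28). It follows then from (38)

$$ e^{2\pi i\tau}\,f^\tau_{AB}(u, \varphi) = e^{i\pi(j_2 - j_1)}\,f^\tau_{AB}(-u, \varphi - 2\pi) \tag{41} $$

Therefore $f^\tau(U)$ can be expanded in the canonical basis

$$ f^\tau(u, \varphi) = \sum_{j = 0, \frac{1}{2}, 1, \ldots}\ \sum_{m, n = -j}^{j}\ \sum_{q} f^{\tau}_{jmnq}\,e^{iq\varphi}\,D^j_{mn}(u) \tag{42} $$

$$ \tfrac{1}{2}(j_2 - j_1) + j - q = \tau \bmod 1 \tag{43} $$

This expansion is in many cases more advantageous than a Fourier decomposition
into plane waves on $M_4$.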

The central element $Z$ is an element of a one-parameter subgroup of $G_D$ and
as such can be written in the form

$$ Z = e^{2\pi i T} \tag{44} $$

The self-adjoint operator $T$ is not uniquely determined but only up to a
self-adjoint operator with integer eigenvalues, such as a number operator in a
free field model. $T$ has always the character of an 'anomalous part' of such a
number operator in the known models. The eigenvalues of $T$ and its corresponding
eigenspaces fix the irreducible components of a local operator. In the
known models its spectrum is discrete and it has been suggested (Lüscher and
Mack, 1975) that this be so in general. It has also been proved (Lüscher and
Mack, 1975) that the spectrum of $T$ can be assumed to be positive. We shall
make use now of the hypothesis that the spectrum is discrete.

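If $\lambda_i$ denote the eigenvalues of $T$ and $\Pi(\lambda_i)$ their respective projection
operators on the eigenspaces, we define

$$ \Phi_{\lambda_1\lambda_2} = \Pi(\lambda_1)\,\Phi\,\Pi(\lambda_2) $$

It follows then, on inserting this into (28), that $\Phi_{\lambda_1\lambda_2}$ contributes to $\Phi_\tau$ only if

$$ \tau = \lambda_1 - \lambda_2 \bmod 1 $$

and finally

$$ \Phi_\tau = \sum_{\lambda_1 - \lambda_2 = \tau \bmod 1} \Phi_{\lambda_1\lambda_2} \tag{48} $$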
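
The selection rule relating $\tau$ to the eigenvalues of $T$ can be made explicit with a short formal computation. It is a sketch under stated assumptions: $Z = e^{2\pi i T}$, a discrete spectrum of $T$ with projectors $\Pi(\lambda)$, the component $\Phi_{\lambda_1\lambda_2} = \Pi(\lambda_1)\Phi\,\Pi(\lambda_2)$, and a Fourier sum over the centre with weight $e^{-2\pi i n\tau}$; the sum over $n$ is treated formally as a periodic delta function.

```latex
Z\,\Pi(\lambda) = e^{2\pi i\lambda}\,\Pi(\lambda)
\quad\Longrightarrow\quad
Z^{n}\,\Phi_{\lambda_1\lambda_2}\,Z^{-n}
  = e^{2\pi i n(\lambda_1 - \lambda_2)}\,\Phi_{\lambda_1\lambda_2},
\\[4pt]
\sum_{n=-\infty}^{+\infty} Z^{n}\,\Phi_{\lambda_1\lambda_2}\,Z^{-n}\,e^{-2\pi i n\tau}
  = \Phi_{\lambda_1\lambda_2}\sum_{n=-\infty}^{+\infty} e^{2\pi i n(\lambda_1 - \lambda_2 - \tau)}
  = \Phi_{\lambda_1\lambda_2}\sum_{k=-\infty}^{+\infty}\delta(\lambda_1 - \lambda_2 - \tau - k),
```

so only pairs with $\lambda_1 - \lambda_2 = \tau \bmod 1$ survive in $\Phi_\tau$.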

As an example we now present the Thirring model (Kupsch, Rühl
and Yunn, 1975; Rühl, 1975). We shall use the formalism by means of fields on
M" c - The main tool in the construction of the Thirring field is the current 
operator (Dell'Antonio, Frishman and Zwanziger, 1972) that transforms as a 
free mass zero vector field. Its components $j_+$ and $j_-$ depend only on $\varphi_+$ and $\varphi_-$
respectively and, expanded in the canonical basis, are

1 OO 1 

/*(*>*) = - Z (w + l) 5 {cl, m e' (m+I ^ + c ± , m e 

IT m=0 f 




j ± (<p ± ) =/ ± - ) (<pJ+/ ± +) (^ ± )+-<? ± 


The operators $c_{\pm,m}$, $c^\dagger_{\pm,m}$ satisfy canonical commutation relations, e.g.

and commute with the charge operators Q ± . We introduce the 'sources' of the 
current operators by 

itW = 7rf" ± d^/l-ty) = YtX<pJ (52) 


Then, following Rühl and Yunn (1975b), we define the Thirring field ($\gamma = 1, 2$) by

f y (<p) = 2 d exp 1 Z C±, y [lt\cp ± ) +§0 ± <p ± ] 


x * y exp i Z C ± , Y [ri + V*) + ±Q±V±\ (53) 


where the coefficients C±, Y are defined by 

Q ± <r y = a y [Q ± + C ± , y ] (54) 

$\sigma_\gamma$ commutes with the source terms (52).

Under a space reflection we require that the two components of $\sigma_\gamma$ interchange
and that $Q_\pm$ goes into $Q_\mp$. This necessitates

$$C_{\pm,1} = C_{\mp,2} := C_\pm \qquad (55)$$

which leaves two free parameters to the model. The spin and dimension of the 
field operator (53) are 

d = %Cl + Cl] (56) 

s=\\Cl-Cl\ (57) 



By differentiation of (53) we find four equations 

-ijfJM = ^4(^W+^o ± )/>) 
+ />)(/ ± + W+^Q ± )] 

two of which can be shown to be identical with the field equations (Thirring, 1958)

- »a M y^(x) = gy"tf:\xMx) + ^(x)J ( ;\x)] (59) 

whereas two equations are additional. The coupling constant g is 

g = 27rC- (60) 

The operator $\sigma_\gamma$ is a constant field belonging to $d = s = 0$, as can be seen from
(56), (57). It satisfies abnormal commutation relations (Klaiber, 1968;
Lowenstein and Swieca, 1971)

{oi,o 2 } = {oi,o 2 } = 
[01, oI] = [o 2 ,o 2 ] = — 



and has the Wightman functions 

( = <0|(o 2 )" (o 2 )"]0> 

= (27r)- n S nn . (62) 

The transformation behaviour of the Thirring field (53) under the conformal
group follows from the canonical transformation behaviour of the current,
from the invariance of the vacuum state and the charge operators $Q_\pm$, and from
an appropriate definition of the transformation behaviour of the $\sigma$-field. This
takes account of the non-invariance of the subtraction point $\pm i\infty$ in (52), and is
consistent with the invariance of the Wightman functions (62) and the
commutators (61) (Rühl and Yunn, 1975b). We obtain

UJ y {<p)U- 1 = n{|« ± e^+/3 ± r c H/> + *, <P- g ) (63) 

Moreover we have from (53), (54) 

/> ± + 2ir,*= F ) = e fa< H(?)e 



In fact the finite conformal transformations (63) can be obtained from the
energy momentum tensor (Dell'Antonio and coworkers, 1972) in 'Sugawara form'

©^ = v : J J, : JJ K : g„ v (65) 

by the canonical (that is, free massless field) formulae for the generators, followed by
exponentiation. Among these generators we find the combinations

T ± = I (m + l)cl, m c ± , m +^ 



that generate one-parameter subgroups containing the central elements $Z_+$ and
$Z_-$ of $G_2$, in agreement with (64).
From (54) we see that the eigenspaces of $Q_\pm$ belong to the eigenvalues

$$\lambda_\pm(n_1, n_2) = n_1 C_\pm + n_2 C_\mp \qquad (67)$$

($n_{1,2} = 0, \pm 1, \pm 2, \ldots$). They are obtained by applying $\sigma_1$ $n_1$ times (respectively
$\sigma_1^\dagger$ $(-n_1)$ times) and $\sigma_2$ $n_2$ times (respectively $\sigma_2^\dagger$ $(-n_2)$ times) to the vacuum
state and operating with arbitrary polynomials of the currents on these states. It
follows that the operator $f_\gamma(\varphi)\,\Pi(\lambda_\pm(n_1, n_2))$ has one covariant component only

T ±>y ^ C^nxCt + iijO + iCi 


mod 1 


A crucial property of the Thirring model is that any product of operators $f_\gamma$
or their adjoints $f_\gamma^\dagger$ can be regularized by splitting off a regular factor $R$

nfyitii) YlfM) = SM*yitl<Pih W]R { yi};{y;U<Pih {*>}] (69) 

that is $C^\infty$, multilocal, and conformally covariant in all variables. The singular
factor $S$ is a covariant distribution. Identifying arguments in $R$ leads to other
local conformally covariant operators. Applying derivative operators to $R$
before identifying arguments again leads to local operators, but these are
conformally covariant only if the differential operator is itself covariant in the
sense specified in the subsequent section. For the Thirring model the conformal
analysis of operator products therefore reduces to the analysis of the
regularized products.


Products of local operators 


and their singular behaviour if x approaches y have been studied for two 
different purposes. The first approach was motivated by phenomenology, 
namely the investigation of high energy asymptotic behaviour of certain matrix 
elements of such operator products (e.g. deep inelastic electron proton scatter- 
ing). This approach aimed at asymptotic expansions of the type 

$$A(x)\,B(y) \simeq \sum_n s_n(u)\, C_n(v), \qquad u = x - y, \quad v = \tfrac{1}{2}(x + y) \qquad (70)$$



either for $u \to 0$ ('short distance expansion' or Wilson expansion) or for $u^2 \to 0$
('light-cone expansion'); see Wilson (1969) and Wilson and Zimmermann
(1972), and Brandt and Preparata (1971) and Frishman (1971), respectively. Both
kinds of expansions have been studied in the framework of perturbative
quantum field theory (Zimmermann, 1970, 1973). The singularity of the
function $s_n(u)$ decreases with increasing $n$, whereas the $C_n(v)$ are local operators.

The second approach is fundamental (Polyakov, 1973; Efremov, 1968;
Ferrara and co-workers, 1973; Bonora and co-workers, 1973; Swieca, 1974;
Schroer and co-workers, 1975). If expansions of the type (70) are given for all
local operators in a Wightman formalism, together with all the two-point functions
for these fields, then the structure of the quantum field theory is fixed,
provided the expansion (70) is valid not only asymptotically but in the
sense of weak convergence in some real or complex domain. In fact, all $n$-point
functions can be reduced to two-point functions in this way.
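
As a minimal illustration of this reduction, assume that an expansion of the type $A(x)B(y) \simeq \sum_n s_n(u)\,C_n(v)$ converges weakly, and let $D$ be a further local field ($D$, and $\Omega$ for the vacuum state, are notation introduced here only for the example). Applying the expansion once inside a three-point function gives

```latex
\langle \Omega |\, A(x)\, B(y)\, D(z)\, | \Omega \rangle
  = \sum_n s_n(u)\, \langle \Omega |\, C_n(v)\, D(z)\, | \Omega \rangle ,
\qquad u = x - y, \quad v = \tfrac{1}{2}(x + y),
```

so the three-point function is expressed through the singular functions $s_n$ and two-point functions of local operators. Repeating the step lowers an $n$-point function by one operator at a time.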

Within a conformally covariant quantum field theory this programme seems 
to have a chance to be set up successfully. In fact, fixing a normalization of the 
local fields by any ad hoc prescription, all two-point functions are uniquely 
determined. Moreover, requiring each term in the expansion (70) to be 
conformally covariant reduces the number of local fields and restricts the form 
of the singular functions. We investigate such a programme in this section, 
considering the Thirring model as a guide. We restrict the investigation to M 2 . 

In order to derive an expansion of the Wilson-type whose terms are each 
covariant (or semicovariant, as we shall see), we intend to apply the tensor 
product decomposition theorem for the conformal group. First we project out 
covariant components $A^{\tau_A}(x)$, $B^{\tau_B}(y)$ of $A(x)$ and $B(y)$ as explained in the
preceding section. For these components we make an ansatz

$$A^{\tau_A}(x)\, B^{\tau_B}(y) \simeq \sum_n \int dz\; Q(\chi_A, x; \chi_B, y \mid \chi(n), z)\, C_n^{\tau_C}(z) \qquad (71)$$

where the kernel $Q$ satisfies the covariance constraints. $C_n^{\tau_C}(z)$ is a covariant
component of a local operator with

$$\tau_C = \tau_A + \tau_B \bmod 1$$

$\chi_{A,B}$ and $\chi(n)$ denote the representations of $G_D$ involved. Finally there
remains the problem of recombining the components $C_n^{\tau_C}(z)$ to a local operator.
The tensor product decomposition theorem is used to derive both the kernels $Q$
and the operators $C_n^{\tau_C}(z)$, where for the latter part of the problem explicit
knowledge of the quantum field model under investigation is presumed.

We outline the derivation of the expansion (71) for models in $M^2$ by group
theoretic arguments (Rühl and Yunn, 1975a, b, c). The first tool we need is an
asymptotic completeness relation for covariant kernels of the second kind. We
consider a space of $C^\infty$ functions $f(\varphi)$, $-\infty < \varphi < +\infty$, with

$$f(\varphi + 2\pi) = e^{2\pi i \tau} f(\varphi),$$

with an appropriate set of norms. We denote it $\mathcal{D}_\tau$. It carries a representation
$\chi = (j, \tau)$ of $SU(1, 1)_{uc}$ if we define

2V(*) = l«e"+iB|*- 1 /fo„) 


with $\varphi_g$ as in (22), (23). If $j$ is purely imaginary, $\mathcal{D}_\tau$ can be completed to a Hilbert
space with invariant scalar product

$$(f_1, f_2) = \int_0^{2\pi} \overline{f_1(\varphi)}\, f_2(\varphi)\, d\varphi$$


These representations form the principal series of $SU(1, 1)_{uc}$. If

$$\tfrac{1}{2} - j \mp \tau \equiv 0 \bmod 1 \qquad (76)$$

$\mathcal{D}_\tau$ possesses invariant subspaces $\mathcal{D}_\tau^{\pm}$ respectively, spanned by the canonical
basis elements $e_q$, $q - \tau \equiv 0 \bmod 1$, with

$$\pm q = \tfrac{1}{2} - j + m, \qquad m = 0, 1, 2, \ldots \qquad (77)$$

They carry the discrete series representations.

Tensor products of spaces $\mathcal{D}_\tau$ can be mapped into spaces $\mathcal{D}_\tau$ by means of
operators $K$

which we call covariant if

$$K\,(T_g^{\chi_1} \times T_g^{\chi_2}) = T_g^{\chi_3}\, K \qquad (78)$$

Such covariant operators can be given in the form of convolution kernels. They
themselves span a two-dimensional linear space. As a basis we take the
following two kernels ($\arg(2i \sin(\varphi - i0)) = -\arg(-2i \sin(\varphi + i0)) = \pi/2$ for

(ZnyKiixs, (p 3 \xu <pu x 2 , <Pi) 

= [2i sin §((?! - <p 2 - iO)Y l ~ h+T2 

x [-2/ sin §((?! - <p 2 + iO)Y h ~ h ~ T2 

x [2/ sin U<P2 - <P 3 - iO)P +h ~ h+h 

x [2/ sin |(<p 3 ~<Pi - iO)T i+h+Tl+T2 

x [-2/ sin |(^3 ~ <Pi + iO)T h+h ~ Tl ~ T2 (79) 

(2ir) 3 K 2 (x 3 , <p 3 \xu <PuX2, <Pi) 

= [2/ sin U<Pi~ <Pi ~ iO)T* " /l_Tl 

x [-2/ sin !((?! - <p 2 + /0)] _ ''2- y 3 +T i 

x [2/ sin \{<p 2 -<p 3 - iO)} - * + ' 3 ~ Tl-T2 

x [-2/ sin \{<p 2 -<p 3 + ioy\ + ' l ~ h+Tl+T2 

x [2i sin \{<p 3 -<p r - /0)]- J -" + ' 2+ '3 (80) 


[In Rühl and Yunn (1975a, b, c) the kernel $K_2$ was denoted $K_3$.] Note that $\tau_3$ is
necessarily $\tau_1 + \tau_2 \bmod 1$.

If all representations are in the principal series, the operators are well
defined. For other representations we obtain the operators $K$ by analytic
continuation in $j$. This method is used throughout; the spaces $\mathcal{D}_\tau$, which are
independent of $j$, are particularly useful for this purpose. A kernel $K$ may
develop poles and zeros during this continuation. Poles may either be related
to the appearance of invariant subspaces $\mathcal{D}_\tau^{\pm}$ ($\tau$-type poles) or lie at positions
depending only on $j_1$, $j_2$, $j_3$ ($j$-type poles). Residues of $j$-type poles may turn out
to be differential operators, which are also called covariant. They can be used to
construct new local covariant operators from multilocal covariant operators.

Dual covariant operators for the mapping

$$K^d\, T_g^{\chi_3} = (T_g^{\chi_1} \times T_g^{\chi_2})\, K^d$$

can be obtained from the covariant kernels by replacing $j_i \to -j_i$, $\tau_i \to -\tau_i$
($i = 1, 2, 3$). Both types of kernels together satisfy a completeness relation

(2tt) 2 


•I djuG^psJ d(p 3 

X{K d (Xl, (PU X2, <P2|*3, <P3)Ki(X3, (PilXl, <PuX2, <Pl) 

-Kfixu <Pu X2, <Pi\Xi, <P3)K 2 (X3, <P3\Xi, <Pi; X2, <p'i)} 
+ discrete series terms 

+00 "*"°° 

= £ e 2m ' T ^5(<p 1 -( P i-27rfc 1 ) I e 2 ™^8{<p 2 - <p' 2 -2nk 2 ) (82) 

where $d\mu(\chi_3)_{PS}$ is the principal series part of the Plancherel measure ($j_3 = i\rho$)

$$d\mu(\chi_3)_{PS} = \frac{\rho\, \sinh 2\pi\rho}{\cosh 2\pi\rho + \cos 2\pi\tau_3}\, d\rho$$
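
A quick numerical look at this density, as a sketch only: it assumes the principal series density has the form $\rho \sinh 2\pi\rho / (\cosh 2\pi\rho + \cos 2\pi\tau_3)$ with any overall normalization omitted, and the function name `plancherel_density` is introduced here for illustration. Under that assumption the density reduces to $\rho \tanh \pi\rho$ at $\tau_3 = 0$ and to $\rho \coth \pi\rho$ at $\tau_3 = \tfrac{1}{2}$, and it is periodic in $\tau_3$ with period one, as expected for a quantity defined modulo one.

```python
import math


def plancherel_density(rho, tau3):
    """rho*sinh(2*pi*rho) / (cosh(2*pi*rho) + cos(2*pi*tau3)); normalization omitted."""
    x = 2.0 * math.pi * rho
    return rho * math.sinh(x) / (math.cosh(x) + math.cos(2.0 * math.pi * tau3))


for rho in (0.3, 1.0, 2.5):
    at_zero = plancherel_density(rho, 0.0)
    at_half = plancherel_density(rho, 0.5)
    # Compare with the closed forms rho*tanh(pi*rho) and rho/tanh(pi*rho).
    print(rho, at_zero, rho * math.tanh(math.pi * rho),
          at_half, rho / math.tanh(math.pi * rho))
```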



Applying $K_1$ and $K_2$ to a regularized bilocal covariant operator, we obtain
covariant and in general non-local operators (non-local because of the integration;
in addition there is always the non-locality from projecting on eigenspaces of $T_+$ and $T_-$).
Inserting them into (82) allows us to reconstruct the original operators. Shifting
the contour of (82) in either direction yields contributions from the poles (four
sequences of $j$-type and two of $\tau$-type poles for either of the variables $\varphi_\pm$) plus a
residual integral. Both with increasing and with decreasing $\mathrm{Re}\, j_3$ the degree of
singularity of these contributions grows. Therefore we do not obtain a
covariant Wilson expansion.

To overcome this difficulty we proceed as in the theory of Regge poles (Rühl,
1969). We exploit the symmetry of the integrand in (82) under $j_3 \to -j_3$ and
replace $K_{1,2}$ by kernels of the second kind

Ki=Q la + Q lb 
K d 2 = Q 2a + Q 2b 

so that 




goes into 

t<p 3 {Q2bKi-Q lb K 2 ) 

under $j_3 \to -j_3$. In turn we have explicitly

Q la = aKi + b exp {-iir(j\ -j 2 + Ti + t 2 ) sign sin (<pi - <p 2 )}AT 2 


Q 2a = exp {mt(/i -j 2 + T X + t 2 ) sign sin (<p x - <jp 2 )}Qi 

_ sin 7r(|-/i+/ 2 +/ 3 ) sin ir(&+ h~ t^ t 2 ) 
sin 2irj 3 sin 7r(j\ -j 2 + r t + r 2 ) 


sinff(s+/i-y 2 +/ 3 ) sin 7r(| +/ 3 + r t + t 2 ) 


sin 2irj 3 sin w(Ji ~h +U + t 2 ) 

From (85) we see that the kernels $Q_a$ are not globally covariant but only
infinitesimally, namely whenever $\sin(\varphi_1 - \varphi_2) \neq 0$. We therefore call them
'semicovariant'. For $\varphi_1 \to \varphi_2$ we have asymptotically (as a distribution in $\varphi_3$)

\Q la \ - const |2 sin J(<9i -^a)!"*"*" h+il+h (87) 

whence a decreasing singularity in the left half plane. 

Shifting the contour in (82) then yields the asymptotic completeness relation

+0O 4-oo 

S e 2mV ^5(^ 1 -^' 1 -277fc 1 ) I e 2m ' T ^(? 2 -«p 2 -27rtfc 2 ) 

fcl— oo * 2 =-oo 

= (27r) 2 exp {iV(|-/ 2 + r 2 ) sign sin (<p x - <p 2 )} 
oo /_j\fc + i 

X 2 — TT— n2j 3 -k)i 3 ltgwU 3 + T 1 + T 2 ) + tgTT(j3-T 1 -r 2 )] 
k=0 K\ 

J'2ir <i 2 ir 

d<p 3 \ d<p' 3 Q la (Xu<Pi;X2,<P2\X3,<P3)S(X3,<P3\X3,<P3) 
o Jo 

x A(* 3 , <p' 3 )\xu <PuX2, *>2)U-/.(fc) (88) 


$$\chi_3^c = (-j_3, \tau_3)$$




$$j_3(k) = \tfrac{1}{2} - j_1 - j_2 + k \qquad (90)$$

$S$ is an intertwining operator, i.e. a continuous operator from $\mathcal{D}_{\tau_3}$ into $\mathcal{D}_{\tau_3}$ that
intertwines $\chi_3^c$ and $\chi_3$. $\Delta$ is a covariant differential operator

d<px d<p 2 ACt3,<P3lA'l.<Pi;Ar2,<P2)g(<Pl»<P2) 

Jo Jo 

= Qk( ~i — > -i—)s(.<Pi> <P2)Ui=^=^ ( 91 ) 

\ 3<pi 0<P2/ 

I I«P1 = <P2 = < P3 

where O k is a hypergeometric polynomial 

Gkfal,92)= I (-D m ( fc )(2yi-fc) m (|-/2-42)m(2/2-fc)«c-m(|-/l-<7l)fc-m 
™=n \TYl J 


The asymptotic completeness relation (88) is applied twice, once to the
variables labelled (+), once to those labelled (−) in a regularized covariant
operator $R(\varphi_{1+}, \varphi_{1-}, \varphi_{2+}, \varphi_{2-})$. Of course we first have to project on a
simultaneous eigenspace of $T_+$ and $T_-$. For a light-cone expansion it suffices to apply
(88) to one variable only. In the completeness relation we then have the
0(xl(k + ), *S(*-)» <P 3+ , <P3-)II(A + , A_) 

= 11 1 Qk±(-' » - » 1 \R(<Pi+> <Pi-> <P2+> <P2-)n(A+, A_)| vl=t = <P2± =«, 3± 

± I \ d<pi± d<PtJi 

Obviously the differential operators (91) do not depend on $\lambda_+$ and $\lambda_-$. One can
show then that in this way we obtain a semicovariant Wilson expansion, in so far as
the degree of singularity of the semicovariant kernels decreases with $k_+$ and $k_-$.
But locality of the operators (93) is not yet guaranteed. Finally we multiply
both sides of the expansion for $R$ by the singular factor and get a
semicovariant Wilson expansion for the operator product itself.

It ought to be mentioned that in certain degenerate cases, namely if discrete
series representations occur, the series may be truly covariant termwise (Rühl
and Yunn, 1975b; Swieca, 1974; Schroer and Swieca, 1975). This is due to the
fact that one of the components in (85) drops out, so that covariance is restored.
A famous degenerate case of this kind is obtained if the expansion
operates on the vacuum state. The discrete series representations then appear
as a consequence of the spectrum condition (Section 2).


The Thirring model is particularly simple in several respects. Firstly we know
that the field operators (53), and all those local operators derived from
regularized multilocal products of them, have a certain simple behaviour under
commutation with $Q_\pm$ (see (54)) and therefore with $\tfrac{1}{2}Q_\pm^2$. Thus projecting a
product $A(x)B(y)$ on simultaneous eigenspaces of $T_+$ and $T_-$

$$A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-)$$

fixes both pairs $\tau_{A\pm}$ and $\tau_{B\pm}$. Secondly we can regularize any product of such
operators by splitting off a singular covariant factor, leaving a covariant bilocal
$C^\infty$ operator. Any such operator in $M^2$ leads to a series (a 'family') of operators
with increasing dimension

$$d = -j_+ - j_- + 1$$

which for $\chi_3^c(k_\pm)$ gives

$$j_\pm = -j_3(k_\pm), \qquad d(k_+, k_-) = d_1 + d_2 + k_+ + k_- \qquad (96)$$



i.e. the dimensions increase in integral steps. In fact, our asymptotic
completeness relation is a reordered Taylor expansion and each differentiation raises the
dimension by one (within the context of a Weyl symmetry). It is known from
perturbation theory with respect to $\varepsilon$ in a model in a space-time of $6 - \varepsilon$
dimensions that this set of operators is in general too small for a Wilson
expansion (Mack, 1973).
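
The integral spacing can be checked in a few lines. The sketch assumes that (90) reads $j_3(k) = \tfrac{1}{2} - j_1 - j_2 + k$ for each of the (+) and (−) variables, that $j_\pm = -j_3(k_\pm)$, and that the two factors of the bilocal operator have dimensions $d_i = -j_{i+} - j_{i-} + 1$ ($i = 1, 2$), in line with $d = -j_+ - j_- + 1$; these are readings of the formulas above rather than new input.

```latex
\begin{aligned}
d(k_+, k_-) &= -j_+ - j_- + 1 = j_3(k_+) + j_3(k_-) + 1 \\
  &= \bigl(\tfrac{1}{2} - j_{1+} - j_{2+} + k_+\bigr)
   + \bigl(\tfrac{1}{2} - j_{1-} - j_{2-} + k_-\bigr) + 1 \\
  &= (-j_{1+} - j_{1-} + 1) + (-j_{2+} - j_{2-} + 1) + k_+ + k_- \\
  &= d_1 + d_2 + k_+ + k_- .
\end{aligned}
```

Each unit increase of $k_+$ or $k_-$, i.e. each additional derivative, therefore raises the dimension by exactly one.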

The two properties of the Thirring model are not quite independent. If an
operator product can be covariantly regularized as in (69), then this can make
sense only if projection on eigenspaces from the right and the left fixes the
transformation property of the regularized operator in both variables. In turn
the transformation properties of the factors in the unregularized product must
then also be fixed, namely in

$$\Pi(\lambda'_+, \lambda'_-)\,A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-) = \sum_{\lambda''_\pm} \Pi(\lambda'_+, \lambda'_-)\,A(x)\,\Pi(\lambda''_+, \lambda''_-)\,B(y)\,\Pi(\lambda_+, \lambda_-)$$

all $\lambda''_\pm$ must be equal modulo one.

We study the problem of recombining the different projections to a local
operator first and then try to gain an idea of how the general case might look. In
the Thirring model

$$\Pi(\lambda'_+, \lambda'_-)\,A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-) = s(x, y)\,\Pi(\lambda'_+, \lambda'_-)\,R(x, y)\,\Pi(\lambda_+, \lambda_-) \qquad (98)$$

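$\lambda'_\pm$ is either related to $\lambda_\pm$ by a function depending on the operators $A$, $B$

$$\lambda'_\pm = h_{\pm,A,B}(\lambda_\pm) \qquad (99)$$

or both sides of (98) vanish identically. Moreover $\tau_{A\pm}$ and $\tau_{B\pm}$ are fixed and

$$\tau_{A\pm} + \tau_{B\pm} = \lambda'_\pm - \lambda_\pm \bmod 1 \qquad (100)$$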


Extracting the singular factor $s(x, y)$ changes the transformation properties of $A$
and $B$, namely $j_{A\pm}$, $\tau_{A\pm}$, $j_{B\pm}$, $\tau_{B\pm}$, into those of $R$, namely $j'_{A\pm}$, $\tau'_{A\pm}$, $j'_{B\pm}$, $\tau'_{B\pm}$; explicitly


$$\tau'_{B\pm} = \tau_{B\pm} - t_\pm$$
$$j'_{A\pm} = j_{A\pm} + \kappa_\pm$$
$$j'_{B\pm} = j_{B\pm} + \kappa_\pm \qquad (102)$$

with some real parameters $t_\pm$ and $\kappa_\pm$. $d_1$ and $d_2$ in (96) refer to the
transformation properties of the regularized operator and ought to be primed, too.

We write the decomposition of the regularized operator in the form
($\chi_A = \chi_{A+} \times \chi_{A-}$, etc.)
Il(\' + ,\'-)R(x,y)U{X + ,\-) 

- I \ dz dz'Q(xA,x;x' B Mlc±),z)S(x(k ± ),z\x c (k ± ),z') 

xIl(\' + ,y-)0(x c (k ± ),z')IK\ + ,\-) (103) 

As emphasized in the preceding section, the operators $O(\chi^c(k_\pm), z)$ have a
meaning without the projection operators applied to them, since the
differential operator by which they are obtained from $R(x, y)$ is independent of the
eigenvalues $\lambda'_\pm$ and $\lambda_\pm$. It can be shown by explicit calculation that the
semicovariant kernel $Q$ in (103) (but not $Q_{1a}$!) depends, at least in a
neighbourhood of $x = y$, only on

$$\tau'_{A\pm} + \tau'_{B\pm} = \tau_{A\pm} + \tau_{B\pm} = \lambda'_\pm - \lambda_\pm \bmod 1 \qquad (104)$$

Of course the same holds true for the intertwining operator $S$. In the Thirring
model $\lambda'_\pm - \lambda_\pm$ is in general a function of $\lambda_\pm$. In some cases, however, namely if
the operator product commutes with $Q_\pm$, this difference vanishes. Whenever
the difference is a unique value modulo one, the summation over $\lambda_\pm$ can be
performed in (103), and it results in an expansion for the operator $R$ in terms of
local operators $O(\chi^c(k_\pm), z)$. In general the summation over the spectrum of
$T_+$ and $T_-$ cannot be performed in this way. Then the non-local components of the
operators $O(\chi^c(k_\pm), z)$ cannot be recombined to a local operator. An
alternative formulation makes the kernel $Q$ an operator such that the components of
$O(\chi^c(k_\pm), z)$ combine to a local operator. The non-locality is then carried over
to the kernel.

Finally we want to consider what happens to the kernel Q if we multiply it 
with the singular function s(x, y) (98). If we insert 

$Q$ depends (besides the coordinates) solely on the parameters $j'_{1\pm}$, $j'_{2\pm}$, $k_\pm$, and
$\tau_{A\pm} + \tau_{B\pm}$. After multiplication by $s$ we obtain an analogous function
depending in the same fashion on the parameters $j_{1\pm}$, $j_{2\pm}$, $k_\pm + 2\kappa_\pm$, and $\tau_{A\pm} + \tau_{B\pm}$,
except possibly for an unessential $k$-dependent change in the normalization.

The asymptotic completeness relation (88) has been derived for $C^\infty$ functions.
It can be generalized to other classes of functions exhibiting singularities.
This can be achieved by appropriate regularization techniques. Splitting off a
singular covariant factor as in the Thirring model is just one method for just one
class of functions. For this class of functions we have learnt that the whole
family of local operators gets shifted in dimension by a fixed amount and
that the semicovariant kernels depend solely on the combinations $\lambda'_\pm - \lambda_\pm$.

Crucial to any model in M 2 is therefore the regularization of the individual 
terms in the sum 

$$\sum_{\lambda''_\pm} \Pi(\lambda'_+, \lambda'_-)\,A(x)\,\Pi(\lambda''_+, \lambda''_-)\,B(y)\,\Pi(\lambda_+, \lambda_-) \qquad (106)$$


If all these terms can be regularized by extraction of the same covariant singular
factor, we are in the same position as in the case of the Thirring model. The
dependence on $\lambda''_\pm$ drops out completely and we can sum over it in a trivial
fashion. A slightly more general situation arises if the individual terms in (106)
can be expanded in a series each term of which can be covariantly regularized
by factorization

N 'Zs n (x,y)n(X' + ,\'J)R M 


(*,y)II(A + ,A_) 


One could call such an expansion a 'covariant pre-Wilson expansion'. It is
assumed that the singular factors $s_n$ are independent of $\lambda''_\pm$, and that the degree
of the singularity of $s_n$ decreases with increasing $n$. Each term in the
pre-Wilson expansion yields a family of operators

O nX+ s_(x c n (kJ, z) 

The intermediary projection operators can be eliminated by summation over
$\lambda''_\pm$. Concerning the recombination to local operators we are still in the same
position as in the case of the Thirring model.

If, however, in the pre-Wilson expansion the singular factors depend on $\lambda''_\pm$
in a non-trivial fashion, we have a new source for non-local operators
appearing in the Wilson expansion. In any case, a pre-Wilson expansion (107) always
leads to families of operators whose dimensions are non-integrally separated.
Of course, it is also conceivable that no families of operators occur at all though
this seems rather natural to us.


Bateman, H. (1910) Proc. London Math. Soc., 8, 223.
Bonora, L., Cicciariello, S., Sartori, G. and Tonin, M. (1973) Scale and Conformal Symmetry in Hadron Physics, Proc. Advanced School of Physics Frascati 1972, R. Gatto (Ed.), John Wiley, New York.
Brandt, R. A. and Preparata, G. (1971) Nucl. Phys., B27, 541.
Cunningham, E. (1909) Proc. London Math. Soc., 8, 77.
Dell'Antonio, G. F., Frishman, Y. and Zwanziger, D. (1972) Phys. Rev., D6, 988.
Dobrev, V., Mack, G., Petkova, V. and Todorov, I. T. (1975a) JINR Report B2-7977; Elementary Representations and Intertwining Operators for the Generalized Lorentz Group, Institute for Advanced Study Preprint, Princeton.
Dobrev, V., Petkova, V., Petrova, S. and Todorov, I. T. (1975b) Dynamical Derivation of Vacuum Operator Product Expansion in Euclidean Conformal Quantum Field Theory, Institute for Advanced Study Preprint, Princeton.
Efremov, A. V. (1968) A model of Lie fields, Preprint P2-3731, JINR Dubna.
Ferrara, S., Gatto, R. and Grillo, A. F. (1973) Springer Tracts in Modern Physics, Vol. 67, Springer-Verlag, Berlin, p. 1.
Frishman, Y. (1971) Ann. Phys. (N.Y.), 66, 373.
Fulton, T., Rohrlich, F. and Witten, L. (1962) Rev. Mod. Phys., 34, 442.
Gell-Mann, M. and Low, F. (1954) Phys. Rev., 95, 1300.
Hortacsu, M., Seiler, R. and Schroer, B. (1972) Phys. Rev., D5, 2519.
Jost, R. (1961) 'Properties of Wightman functions', in Lectures on Field Theory and the Many-Body Problem, E. R. Caianello (Ed.), Academic Press, New York.
Kastrup, H. A. (1962) Ann. Phys. (Leipzig), 7, 388.
Kastrup, H. A. (1964) Nucl. Phys., 58, 561.
Klaiber, B. (1968) in Quantum Theory and Statistical Physics, Lectures in Theoretical Physics, Vol. X-A, A. O. Barut and W. E. Brittin (Eds.), Gordon and Breach, New York, p. 141.
Kupsch, J., Rühl, W. and Yunn, B. C. (1975) Ann. Phys. (N.Y.), 89, 115.
Lowenstein, J. H. and Swieca, J. A. (1971) Ann. Phys. (N.Y.), 68, 172.
Lüscher, M. and Mack, G. (1975) Comm. Math. Phys., 41, 203.
Mack, G. (1973) in Strong Interaction Physics, Lecture Notes in Physics, Vol. 17, W. Rühl and A. Vancura (Eds.), Springer-Verlag, Berlin, p. 300.
Mack, G. (1974) in Renormalization and Invariance in Quantum Field Theory, E. R. Caianello (Ed.), Plenum Press, New York, p. 123.

Mack, G. (1975) All Unitary Ray Representations of the Conformal Group SU(2, 2) with Positive Energy, Universität Hamburg Preprint 1975; see this paper for the latest list of references on representations of the conformal group.
Mack, G. and Salam, A. (1969) Ann. Phys. (N.Y.), 53, 174, and references cited there.
Mack, G. and Todorov, I. T. (1973) Phys. Rev., D8, 1764.
Mayer, D. H. (1974) Conformal Invariant Causal Structures on Pseudo-Riemannian Manifolds, Preprint Technische Hochschule Aachen, April 1974.
Migdal, A. A. (1971) Phys. Letters, 37B, 386.
Pohlmeyer, K. (1969) Comm. Math. Phys., 12, 204.
Polyakov, A. M. (1969) Sov. Phys. JETP, 28, 533.
Polyakov, A. M. (1973) Non-Hamiltonian Approach to the Quantum Field Theory at Small Distances, Preprint Landau Institute for Theoretical Physics, Chernogolovka 1973.
Rühl, W. (1969) The Lorentz Group and Harmonic Analysis, Benjamin, New York; see the references in this book for references to Toller's work.
Rühl, W. (1973) Comm. Math. Phys., 30, 287; 34, 149.
Rühl, W. (1975) Acta Physica Austriaca, Suppl. XIV, 643.
Rühl, W. and Yunn, B. C. (1975a) Representations of the Universal Covering Group of SU(1, 1) and Their Bilinear and Trilinear Invariant Forms, Preprint Universität Kaiserslautern, June 1975, to appear in J. Math. Phys.
Rühl, W. and Yunn, B. C. (1975b) Operator Product Expansions in Conformally Covariant Quantum Field Theory, Part I: Strictly Covariant Expansions, Preprint Universität Kaiserslautern, October 1975.
Rühl, W. and Yunn, B. C. (1975c) Operator Product Expansions in Conformally Covariant Quantum Field Theory, Part II: Semicovariant Expansions, Preprint Universität Kaiserslautern, November 1975.
Schroer, B. and Swieca, J. A. (1974) Phys. Rev., D10, 480.
Schroer, B., Swieca, J. A. and Völkel, A. H. (1975) Phys. Rev., D11, 1509.
Segal, I. (1971) Bull. Am. Math. Soc., 77, 958.
Swieca, J. A. (1974) Conformal Operator Product Expansions in the Minkowski Region, Pontifícia Universidade Católica do Rio de Janeiro, Preprint, May 1974.

Thirring, W. (1958) Ann. Phys. (N.Y.), 3, 91.
Wess, J. (1960) Nuovo Cimento, 18, 1086.
Wilson, K. G. (1969) Phys. Rev., 179, 1499.
Wilson, K. and Zimmermann, W. (1972) Comm. Math. Phys., 24, 871.
Zeeman, E. C. (1964) J. Math. Phys., 5, 490.
Zimmermann, W. (1970) Lectures on Elementary Particles and Quantum Field Theory, Brandeis University Summer Institute in Theoretical Physics 1970, Vol. 1, S. Deser, M. Grisaru and H. Pendleton (Eds.), MIT Press, Cambridge Mass., p. 395.
Zimmermann, W. (1973) 'Operator product expansions', in Strong Interaction Physics, Lecture Notes in Physics, Vol. 17, W. Rühl and A. Vancura (Eds.), Springer-Verlag, Berlin, p. 343.

The Construction of 
Quantum Field Theories 


Universitat Bielefeld, Bielefeld, West Germany 


Heisenberg's uncertainty relations are the most compact formulation of the 
two-fold challenge presented by modern physics. As efforts failed to dispute 
their fundamental nature (Einstein's 'God does not throw dice' [Heisenberg 
(1969) gives a vivid eye-witness account of this struggle]) natural philosophy 
was called upon to cope with a radically new way of thinking (Heisenberg, 

Mathematical physics on the other hand was faced with the task of enlarging 
the structures of classical theory in such a way that the uncertainty relations 
would find a place in them, or more precisely to base a consistent theory of 
quantum mechanics on the uncertainty relations, with classical mechanics as a 
macroscopic limit. 

The philosophical revolution has not come to a close in the past 50 years; we
shall not deal with it here. The physicist is reassured by the fact that the other,
theoretical, challenge has been dealt with quite successfully. In the past 50
years quantum mechanics has become well established as a physically relevant
and mathematically consistent theory [cf. for example Mackey's book
(Mackey, 1963) for an axiomatic development of quantum theory on the basis of the
uncertainty principle]. But not all is well. Einstein's theory of relativity, some
20 years before the advent of quantum mechanics even, amounted to yet 
another transgression beyond the domain of classical physics. At first glance 
these new territories appear to be quite disjoint: quantum theory takes the 
place of classical mechanics in the submicroscopic domain of atoms and nuclei, 
while special relativity does so in the realm of high velocities, comparable to 
that of light. 

However, in any attempt to formulate a theory of elementary particles it is 
the uncertainty principle itself which points to the necessity of an amalgam 
between quantum mechanics and special relativity. If we insert subnuclear 
masses and dimensions in 




we find a velocity range Ap/m which is by no means small compared to the 
speed of light. 
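
A rough numerical check of this estimate, as a sketch under stated assumptions: the uncertainty relation is used in the order-of-magnitude form $\Delta p \approx \hbar/\Delta x$, velocities are estimated non-relativistically as $\Delta p/m$, and the masses and lengths below (proton, charged pion, a 1 fm region, an electron confined to a Bohr radius) are illustrative inputs that do not appear in the text. The function name `velocity_spread_over_c` is likewise hypothetical.

```python
HBAR_C_MEV_FM = 197.327  # hbar*c in MeV*fm


def velocity_spread_over_c(mass_mev, delta_x_fm):
    """Non-relativistic estimate (delta p / m) / c, taking delta p ~ hbar / delta x."""
    delta_p_c_mev = HBAR_C_MEV_FM / delta_x_fm  # delta p * c in MeV
    return delta_p_c_mev / mass_mev


# Illustrative inputs (assumptions, not values from the text).
proton = velocity_spread_over_c(938.272, 1.0)         # proton confined to ~1 fm
pion = velocity_spread_over_c(139.570, 1.0)           # charged pion confined to ~1 fm
electron = velocity_spread_over_c(0.510999, 52917.7)  # electron within a Bohr radius

print(f"proton:   {proton:.3f}")
print(f"pion:     {pion:.3f}")
print(f"electron: {electron:.5f}")
```

For the atomic electron the ratio comes out close to the fine-structure constant, which is safely non-relativistic, whereas at subnuclear scales it is of order one. A value above one only signals that the non-relativistic estimate itself has broken down.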

And yet in these past 50 years, and in spite of the hard work of what are now
generations of physicists, the construction of a relativistic quantum theory of
interacting particles has not come to a close. Here we are faced with a problem
that has turned out to be much more tenacious than its non-relativistic
counterpart.
Two questions come to mind: 

'Why not give up?' 

and if this can be answered to satisfaction, then 

'Why is it so hard?' 

To answer the first one need only observe that we are dealing with two 
theories— quantum mechanics and special relativity— which are undoubtedly 
appropriate and powerful where only one and not both extensions of classical 
physics are called for, i.e. for submicroscopic phenomena as long as one may 
neglect the relativistic aspect and, respectively, for relativistic phenomena as 
long as quantum effects are unimportant. A fundamental theory — of elemen- 
tary particles, if this concept should indeed survive— must deal with 
phenomena which are at the same time submicroscopic and relativistic, hence 
the quest for such a unified relativistic quantum theory is tantamount to the 
search for a fundamental theory of matter. 

Now why is this so hard that it has defied the efforts of so many?— To clarify 
this a few generalities concerning the physical 'ansatz' are in order. Evidently 
different attacks on the problem have been based on different sets of assump- 
tions, and one has frequently criticized the following list for being overly 
conservative, until the recent successes of constructive quantum field theory 
gave indications that they are reasonable. 

As in the non-relativistic theory one assumes a Hilbert space description of 
the physical system with the (pure) states represented by unit vectors and the 
observables by a non-commutative algebra of operators. 

A relativistic space-time structure is introduced if one considers local 
observables, i.e. operators with a space-time label, which 

(1). transform covariantly under a suitable unitary representation of the
Poincaré group with a unique invariant state (the vacuum).
(2). commute if they are affiliated with space-like separated regions of space-time.

More specifically one can consider the algebras of (e.g. bounded) observables 
that are associated with given space-time regions [see Araki (1969) for a review 
of this 'algebraic approach']. For all its mathematical advantages this 
framework has not permitted the formulation of a dynamical ansatz. If on the 
other hand one tries to use classical relativistic dynamics as, for example, given 
by the equations of motion of electrodynamics as a guideline for the 
construction of a quantized relativistic theory one is immediately confronted 
with the concept of operators labelled by space-time points, i.e. with local 
relativistic quantum fields. 

Apart from deviations often dictated by frustration the construction of such 
fields has been the goal of many elementary particle theorists for some 40 
years. How would one go about this? As early as 1936 Heisenberg discussed 
the relevance of classical non-linear field theory for elementary particle physics 
(Heisenberg, 1939). But almost all of the pertinent research since then has 
followed a different course. A systematic understanding of non-linear wave 
equations is only now beginning to emerge (cf., for example, Reed, 1976), and 
as recent are most efforts to base quantization on their solutions (Dashen and 
coworkers, 1974). 

Instead, for lack of more powerful methods, non-interacting 'free' fields — 
the construction of which poses no insurmountable problems — were taken as a 
starting-point. Interaction terms modelled after those of the classical theories 
were then added to the free Hamiltonian in the hope that it might be possible to 
treat the resulting dynamical changes perturbatively. 

This program quickly ran into problems of convergence, ill-defined 
divergent expressions, etc. 'Subtraction physics' evolved first as the art of 
dropping infinite terms to obtain a finite remainder, later in a systematic way as 
renormalization theory. While this allowed precise predictions at least for 
quantized electrodynamics the questions of existence remained open. This 
was — and is — particularly serious for nuclear forces since they are so strong 
that the perturbative approximations must also fail. 

Before embarking on a more detailed discussion of Hamiltonian quantum 
field theory it is worth while to pause and — with a good portion of hindsight — 
to isolate and exhibit the sources of these difficulties. 

(A) The Infinite Volume Divergence 

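Addition of the space integral of an interaction energy density to the free 
Hamiltonian 

$$H = H_0 + \int h_I(x)\, d^s x$$

gives rise to an operator which cannot be finite when applied to the vacuum 
$\Omega_0$, 

$$H_0\,\Omega_0 = 0$$

so that 

$$\|H\,\Omega_0\|^2 = \int (\Omega_0,\, h_I(x)\, h_I(y)\,\Omega_0)\, dx\, dy$$

with the integrand depending only on $x - y$ because of translation invariance; 
the integration over the remaining variable $x + y$ therefore diverges. 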

A more refined argument leads to 'Haag's theorem' [Haag (1955); for a very 
general proof cf., for example, Emch (1972)] which says that the canonical 
variables of the problem with interaction cannot be equivalent to those 
appropriate for the free Hamiltonian. Note that 'all' representations of the 
canonical commutation relations 

$$[q_i, q_k] = [p_i, p_k] = 0, \qquad [q_i, p_k] = i\hbar\,\delta_{ik}, \qquad i, k = 1, \ldots, n$$

are unitarily equivalent (up to multiplicity and under reasonable technical 
assumptions). This important theorem of von Neumann (see, for example, 
Putnam, 1967) assures us that, for example, Heisenberg's matrix representa- 
tion of these operators will never produce results that are different from those 
calculated in, say, Schrödinger's framework, where 

$$q_i = x_i, \qquad p_k = -i\hbar\,\frac{\partial}{\partial x_k} \qquad \text{in } L^2(d^n x)$$

In particular, any dynamical problem of quantum mechanics can be stated in 
these terms and solved by applying to an 'initial data' function from $L^2$ the 
unitary group generated by the Hamiltonian. 

Not so in quantum field theory: von Neumann's uniqueness theorem breaks 
down as the number of degrees of freedom becomes infinite. There is then a 
vast and largely unexplored set of inequivalent representations for the canoni- 
cal algebra, and Haag's theorem tells us that we have to find a non-standard one 
appropriate for the given Hamiltonian — not even the canonical algebras for 
free fields of two different masses are equivalent. We mention in passing that 
the situation is somewhat different if we formulate the initial value problem not 
on a space-like hyperplane of space-time like, for example, 

$$\{(\mathbf{x}, t) : t = 0\}$$

but on a light-like one such as 

$$\{(\mathbf{x}, t) : x_1 + t = 0\}$$

But at this point little is known about the adequacy of such data for non-trivial 
theories [cf., for example, Leutwyler and coworkers (1970) or, for some recent 
results and further references, Driessler (in press)]. That is, we have to deal 
with the paradoxical situation that to state the initial values has become a 
non-trivial, dynamical question, and we have to solve it before we can even 
formulate the dynamical problem correctly. This discouraging paradox was 
bound to influence the directions of research. The decade from 1955 to 1965 
was characterized by the strategy to learn about field theories not by 
construction but by postulating their existence and fundamental properties 
(locality, relativistic covariance, energy-momentum spectrum) as in the 
'axiomatic' formulations of Lehmann, Symanzik and Zimmermann, of Wightman 
and, for local rings of bounded observables, of Araki, Haag and Kastler (cf., for 
example, Jost, 1965; Streater and Wightman, 1964; Emch, 1972). The 
constructive problem was generally relegated until after the advent of some 
'totally new creative idea, a further essential change in our conceptions of the 
structural laws of matter' as one author put it in 1956. It was a fascinating 
episode of recent science history to observe how, ten years later, this turned 
out not to be the case. But let us first turn to the other obstacles that field 
quantization had to cope with. 

(B) The Ultraviolet Divergences 

There were expectations in the early days of quantum field theory that 
singularities such as the infinite self-energy of classical point charges would go 
away through quantization. But quite to the contrary, virtually every second 
calculation of quantum electrodynamics included the process of throwing away 
an infinite term and interpreting the remainder as the 'correct result'. These 
procedures were formalized in the renormalization theory of Feynman, Dyson 
and Schwinger. Used as a recipe for calculations they allowed for the astound- 
ing numerical predictions of quantum electrodynamics while at the same time 
the meaning of the formal dynamical ansatz or the formulation of a meaningful 
one was further obscured. 

(C) Series Divergences 

The successes of renormalized perturbation theory as applied to quantum 
electrodynamics are even more impressive in the face of yet another type of 
divergence — series divergences. For non-linear interactions the convergence 
question of the perturbation series for, say, the Green's functions looks 
hopeless. Combinatorial considerations show a veritable explosion of the 
number of terms as the order of the perturbation increases. Also a glance at, for 
example, the quartic oscillator potential makes plausible that inverting the sign 
of the coupling constant changes the nature of the interaction so drastically that 
we should not expect analyticity in the neighbourhood of zero. 
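
The growth of the coefficients can be made concrete in a zero-dimensional caricature of the quartic interaction, the ordinary integral Z(g) = (2π)^(-1/2) ∫ exp(-x²/2 - g x⁴) dx. This toy model is an assumption introduced purely for illustration and does not appear in the text; the function names are likewise hypothetical. A minimal Python sketch compares the partial sums of the formal power series in g with a numerical value of the integral:

```python
from math import exp, factorial, pi, sqrt


def series_coefficient(n):
    """Coefficient of g**n in the formal expansion of Z(g).

    Expanding exp(-g x^4) and using the Gaussian moment
    <x^(4n)> = (4n-1)!! = (4n)! / (2^(2n) (2n)!) gives
    a_n = (-1)^n (4n-1)!! / n!.
    """
    moment = factorial(4 * n) // (2 ** (2 * n) * factorial(2 * n))
    return (-1) ** n * moment / factorial(n)


def z_exact(g, half_width=10.0, steps=4000):
    """Simpson-rule value of (2 pi)^(-1/2) * integral exp(-x^2/2 - g x^4) dx."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * exp(-0.5 * x * x - g * x ** 4)
    return total * h / 3.0 / sqrt(2.0 * pi)


def truncation_errors(g, max_order):
    """|partial sum - exact value| for truncation orders 0 .. max_order."""
    exact = z_exact(g)
    errors, partial = [], 0.0
    for n in range(max_order + 1):
        partial += series_coefficient(n) * g ** n
        errors.append(abs(partial - exact))
    return errors


if __name__ == "__main__":
    for order, err in enumerate(truncation_errors(0.01, 30)):
        print(order, err)
```

The error first shrinks, reaches a minimum at an order of roughly 1/(16 g), and then grows without bound, which is the behaviour of an asymptotic rather than a convergent series. For g < 0 the integral itself diverges, in line with the remark that analyticity around g = 0 cannot be expected.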

(D) Infrared Divergences 

With these we come to the end of our list of difficulties. They arise from the long 
range of forces mediated by the exchange of massless particles. In momentum 
space these long distance problems become problems of small momenta (hence 
the name). Certain aspects can be studied in non-relativistic models: note the 
discussion of scattering from potentials with Coulomb tails. Also, in contradis- 
tinction to the other complications, this one is not intrinsic to all non-trivial 
local quantum field theories. It does not arise as long as we focus on theories 
without massless excitations, and we shall not consider it further. 

As we turn to an account of recent progress in constructive quantum field 
theory we shall aim neither for mathematical rigour nor for any kind of 
completeness (this would be quite meaningless anyway in a situation of such 
rapid progress) but instead we shall try to communicate to the non-expert how 
the main structural problems are being tackled and what one can say about the 
evolving theory. (The references were selected correspondingly.) 


A systematic exposition of the subject may be found in various texts (e.g., 
Schweber, 1961; Bjorken and Drell, 1965); we shall content ourselves here with 
presenting the most important concepts as generalizations of ones that are 
well-known from non-relativistic quantum mechanics to the case of infinitely 
many degrees of freedom. 

We shall begin our short dictionary of quantum field theory language with 
the canonical variables, which in field theory are indexed by points in 
s-dimensional space. In this and the following examples the expressions on the 
left-hand side refer to quantum mechanics and those on the right-hand side to 
quantum field theory. 

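$$[q_i, p_k] = i\,\delta_{ki} \qquad\qquad [\varphi(x), \pi(y)] = i\,\delta^{(s)}(x - y)$$

Generic variables are obtained as follows 

$$(q, \lambda) = \sum_{i=1}^{n} \lambda_i q_i, \quad \lambda \in \mathbb{R}^n \qquad\qquad \varphi(f) = \int f(x)\,\varphi(x)\, d^s x, \quad f \in \mathcal{S}$$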

This 'smearing out' with smooth, rapidly decreasing functions has the extra 
advantage of making $\varphi(f)$ a less singular operator than $\varphi(x)$ is. 

The equations of motion are 

$$\dot q_k = i[H, q_k] = p_k \qquad\qquad \dot\varphi(x) = i[H, \varphi(x)] = \pi(x)$$

$$\ddot q_k = i[H, p_k] \qquad\qquad \ddot\varphi(x) = i[H, \pi(x)]$$

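For the vacuum state we borrow a typical property of quantum mechanical 
ground states: 

$$\psi_0(x) \neq 0 \ \text{ for almost all } x \qquad\qquad \Psi_0 \ \text{cyclic for the field algebra } \mathcal{A}_\varphi, \ \text{i.e. } \mathcal{A}_\varphi \Psi_0 \ \text{dense in the representation space}$$

Such cyclic representations allow for a very compact and useful description via 

$$E(\lambda) = (\psi_0,\, e^{i(q,\lambda)}\,\psi_0) = \int_{\mathbb{R}^n} e^{i(\lambda, x)}\, |\psi_0(x)|^2\, d^n x \qquad\qquad E[f] = (\Psi_0,\, e^{i\varphi(f)}\,\Psi_0) = \int_{\mathcal{S}'} e^{i(x, f)}\, d\mu(x)$$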

In quantum mechanics, E is the 'characteristic function' (= Fourier transform) 
of a probability measure $|\psi_0(x)|^2\, d^n x$ on the vector space 
$\{x\} = \mathbb{R}^n$ dual to the $\lambda$. In quantum field theory, E is the 
characteristic functional of a probability measure $\mu$ on the vector space 
$\mathcal{S}'$ of distributions dual to the space $\mathcal{S}$ (cf., e.g., 
Gelfand and Vilenkin, 1964; Hida, 1970). 

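A prominent example is furnished by the harmonic oscillator ground state and 
its field theoretical counterpart: 

$$H_{\mathrm{osc}} = \tfrac12 (p, p) + \tfrac12 (q, \omega^2 q) - E_0 \qquad\qquad H_0 = \tfrac12 \int d^s x\, :\pi^2(x) + (\nabla\varphi(x))^2 + m^2\varphi^2(x): \;=\; \tfrac12 \int d^s x\, :\pi^2 + \varphi\,\omega^2\varphi:$$

$$H_{\mathrm{osc}}\,\psi_0 = 0$$

so that in this case 

$$E(\lambda) = e^{-\frac14 (\lambda,\, \omega^{-1}\lambda)} = e^{-\frac12 \|(q,\lambda)\psi_0\|^2} \qquad\qquad E[f] = e^{-\frac14 (f,\, \omega^{-1} f)} = e^{-\frac12 \|\varphi(f)\Psi_0\|^2}$$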
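
For a single degree of freedom the quantum mechanical formula reduces to E(λ) = exp(-λ²/(4ω)), which can be checked directly against the ground-state density. The following Python sketch does this by numerical integration; the restriction to one degree of freedom, the units with ħ = 1 and the function names are assumptions made for the illustration.

```python
from math import cos, exp, pi, sqrt


def ground_state_density(x, omega):
    """|psi_0(x)|^2 for H = p^2/2 + omega^2 x^2/2 (units with hbar = 1)."""
    return sqrt(omega / pi) * exp(-omega * x * x)


def characteristic_function(lam, omega, half_width=12.0, steps=6000):
    """Simpson-rule value of E(lam) = integral exp(i lam x) |psi_0(x)|^2 dx.

    The imaginary part vanishes because the density is even in x,
    so only the cosine part is integrated.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * cos(lam * x) * ground_state_density(x, omega)
    return total * h / 3.0


def gaussian_formula(lam, omega):
    """Closed form exp(-lam^2 / (4 omega)) for one degree of freedom."""
    return exp(-lam * lam / (4.0 * omega))


if __name__ == "__main__":
    for lam, omega in [(1.0, 1.0), (2.5, 0.7), (0.3, 3.0)]:
        print(lam, omega, characteristic_function(lam, omega), gaussian_formula(lam, omega))
```

The exponent is one half of the second moment of (q, λ) in the ground state, here λ²/(2ω), which is the single-mode version of the statement that a Gaussian characteristic function is fixed by its second moments.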

In both cases we are dealing with multivariate Gaussian distributions of 
mean zero: their characteristic function(al)s are obtained by exponentiating 
their second moments. With 

$$\omega = (-\Delta + m^2)^{1/2} \quad \text{so that} \quad (f, \omega^{-1} f) = \int \overline{\tilde f(k)}\, \frac{1}{\sqrt{k^2 + m^2}}\, \tilde f(k)\, d^s k$$

$E[f]$ is the generating functional of the Fock representation of a scalar 
relativistic free field of mass m. 

$$\ddot\varphi(x) + \omega^2 \varphi(x) = 0$$

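Creation and annihilation operators are introduced through 

$$q_k = \frac{1}{\sqrt{2\omega_k}}\,(a_k^+ + a_k) \qquad\qquad \varphi(x) = (2\pi)^{-s/2} \int \frac{d^s k}{\sqrt{2\omega(k)}}\, \left( a^+(k)\, e^{-ikx} + a(k)\, e^{ikx} \right)$$

$$p_k = i\sqrt{\frac{\omega_k}{2}}\,(a_k^+ - a_k) \qquad\qquad \pi(x) = i\,(2\pi)^{-s/2} \int d^s k\, \sqrt{\frac{\omega(k)}{2}}\, \left( a^+(k)\, e^{-ikx} - a(k)\, e^{ikx} \right)$$

$$[a_k, a_{k'}^+] = \delta_{kk'} \qquad\qquad [a(k), a^+(k')] = \delta^{(s)}(k - k')$$

$$a(k)\,\Psi_0 = 0, \qquad \omega(k) = \sqrt{k^2 + m^2}$$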

In both cases the double dots : : of Wick ordering signify ordering field operator 
products such that all annihilation operators $a$ stand to the right of the 
creation operators $a^+$. 
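
As a worked illustration of this prescription (a standard computation that is not taken from the source), consider a single oscillator mode with $q = (a^+ + a)/\sqrt{2\omega}$, $[a, a^+] = 1$ and ground state $\psi_0$; the restriction to one mode is an assumption made for brevity:

```latex
\begin{align*}
q^2 &= \frac{1}{2\omega}\left(a^{+}a^{+} + a\,a + a^{+}a + a\,a^{+}\right)
     = \frac{1}{2\omega}\left(a^{+}a^{+} + a\,a + 2\,a^{+}a + 1\right), \\
:q^2: &= \frac{1}{2\omega}\left(a^{+}a^{+} + a\,a + 2\,a^{+}a\right)
     = q^2 - \frac{1}{2\omega}
     = q^2 - (\psi_0,\, q^2\,\psi_0).
\end{align*}
```

Wick ordering thus subtracts the ground-state expectation value. For the field the corresponding subtraction is $(\Psi_0, \varphi(x)^2\,\Psi_0) = (2\pi)^{-s} \int d^s k / 2\omega(k)$, which diverges for every $s \geq 1$, so that $:\varphi^2(x):$ differs from the ill-defined square $\varphi^2(x)$ by an infinite constant.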


In particular this procedure makes $:\varphi^n(x):$ a well-defined local operator 
in the sense that 

$$:\varphi^n:(f) = \int f(x)\, :\varphi^n(x):\, d^s x$$

is densely defined or even self-adjoint for suitable n and s. 
As a consequence 

$$g\,h_I(x) = g\, :\varphi^n(x): \qquad n > 2$$

has become the classical ansatz for the interaction energy density of a 
self-coupled scalar field. In the following we shall concentrate on models of 
this type. While self-interacting scalar fields may not be appropriate as a 
fundamental concept for particle physics, they provide the simplest model for 
the discussion of the basic mathematical problems inherent in any non-trivial 
field theoretical ansatz. 


Cutoffs come to mind as a remedy of the basic problems: 'putting the theory 
into a finite box' to avoid the infinite volume divergence, setting finite upper 
limits for momentum space integrals to eliminate ultraviolet divergences. 
These techniques have been employed from the early days of relativistic 
quantum field theory. From the theoretical point of view any such surgery 
amounts to a violation of basic symmetries and principles such as translation 
invariance or locality; from the practical point of view it was often a matter of 
luck or intuition to extract just those quantities which were not violently 
cutoff-dependent, or otherwise to find a (more or less) physical interpretation 
of the cutoff. 

It is a major and characteristic achievement of constructive quantum field 
theory that one has learned to make the cutoffs reversible, by first introducing 
sufficiently many of them to be able to construct a well-defined model and then 
controlling the limits as the cutoffs are removed in such a way that a non-trivial 
relativistic quantum field theory emerges. Evidently on the practical side much 
was learned about which quantities do not depend catastrophically on cutoffs 
and hence are amenable to approximate computation. 

The Guenin-Segal strategy [reviewed by Jaffe (1969)] presents the most 
transparent example of such reversible surgery. Its goal is to circumvent Haag's 
theorem, i.e. to deal with the infinite volume divergence of interaction Hamil- 
tonians such as 

$$H = H_0 + \int g\, :\varphi^n(x):\, d^s x$$

and it is based on one cutoff and two observations. 

The cutoff is rather obvious. Since the infinite integral over the interaction 
energy density causes the problem we reduce the latter to zero at large 
distances through a space-dependent coupling $g(x)$ 

$$g(x) = \begin{cases} g > 0, & |x| < l \\ 0, & |x| > l + \varepsilon \end{cases}$$

with a smooth transition between the regions of constant coupling strength 
$|x| < l$ and of no interaction $|x| > l + \varepsilon$. We denote the modified 
Hamiltonian by $H_l$. 

The two observations that bring this cutoff under control exploit the locality 
of the interaction term and the continuity of vacuum expectation values. 

(A) Locality 

The equation of motion 

$$\ddot\varphi(x) = i[H_l, \pi(x)]$$

is insensitive to the values of $g(y)$ for $y \neq x$ since $\pi(x)$ commutes with 
the energy density at such points. Causal propagation of the field [a feature to 
be verified! (Jaffe, 1969, p. 126ff.)] then allows one to conclude that the time 
evolution of the field is insensitive to the cutoff in the causal dependence 
region of the constant coupling domain $|x| < l$, i.e. we have a cutoff 
independent solution in the diamond 

$$\{(\mathbf{x}, t) : |\mathbf{x}| + |t| < l\}$$

in which we can imbed any bounded space-time region by choosing a 
sufficiently large, yet finite, cutoff parameter $l$. But this solution of the 
equations of motion is not all that is required. For the construction of physical 
states we next invoke the following. 

(B) Continuity of the Vacuum Expectation Values 

Looking for a physical vacuum which, formally, should be given to us as the 
lowest lying eigenstate of $H$, we run into the following problem. Consider the 
ground states $\Omega_l$ of the approximate Hamiltonians $H_l$ [it is by no 
means trivial to verify their existence (Glimm and Jaffe, 1970)]: on the basis of 
Haag's theorem we cannot hope for $\Omega_l$ to have a non-trivial limit as the 
cutoff $l$ is taken to infinity. However there is a subtle distinction between 
convergence of the vectors $\Omega_l$ in Fock space and that of the expectation 
functionals 

$$\omega_l(A) = (\Omega_l,\, A\,\Omega_l)$$

on the field algebra generated by the approximate vacua. The following 
heuristic argument supports this distinction: as the cutoff parameter $l$ is 
increased, the ground state differs from the Fock vacuum (and all other Fock 
space vectors) over larger and larger regions until in the limit it becomes 
orthogonal to all of them ['van Hove's phenomenon', Guerra (1972)] 

$$\text{w-}\lim_{l \to \infty} \Omega_l = 0$$


On the other hand it is plausible that the expectation value of local observables 
$A$ changes only little if the state in question is altered at distances of the 
order of a large $l$, and less and less as $l$ approaches infinity: 

$$\lim_{l \to \infty} \omega_l(A) = \omega(A)$$

is expected to exist. 
There is then a well-known procedure ('GNS construction'*) to cast $\omega(A)$ in 
a Hilbert space form: 

$$\omega(A) = (\Omega,\, A\,\Omega)$$

At first sight this may be confusing. Have we found a vector $\Omega$ where 
there was none before? This is not the case. Recall that we have found it 
impossible to construct the limiting vector $\Omega$ in Fock space. Here it occurs 
as a cyclic vector of a field which is inequivalent to that of the Fock 
representation. 
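
The loss of the limiting vector can be illustrated with a simpler example than the interacting model, namely the earlier remark that free fields of two different masses already have inequivalent representations. The sketch below is an illustration under stated assumptions, not the construction used in the text: it takes a free scalar field in a periodic one-dimensional box of length L, treats every momentum mode as one oscillator, and multiplies the overlaps of the corresponding Gaussian ground states. The function names, the momentum cutoff `k_max` and the chosen masses are hypothetical.

```python
from math import exp, log, pi, sqrt


def log_vacuum_overlap(box_length, mass_a, mass_b, k_max=50.0):
    """Logarithm of the overlap of the two free-field vacua in a periodic box.

    Two oscillator ground states with frequencies wa and wb have overlap
    (4 wa wb / (wa + wb)^2)^(1/4); the field vacuum is the product over
    all modes k = 2 pi n / L with |k| <= k_max.
    """
    n_max = int(k_max * box_length / (2.0 * pi))
    total = 0.0
    for n in range(-n_max, n_max + 1):
        k = 2.0 * pi * n / box_length
        wa = sqrt(k * k + mass_a ** 2)
        wb = sqrt(k * k + mass_b ** 2)
        total += 0.25 * log(4.0 * wa * wb / (wa + wb) ** 2)
    return total


def vacuum_overlap(box_length, mass_a, mass_b, k_max=50.0):
    """Overlap (Psi_a, Psi_b) of the two vacua, a number between 0 and 1."""
    return exp(log_vacuum_overlap(box_length, mass_a, mass_b, k_max))


if __name__ == "__main__":
    for length in (10.0, 20.0, 40.0, 80.0):
        print(length, vacuum_overlap(length, 1.0, 2.0))
```

The logarithm of the overlap is proportional to L, so the overlap tends to zero as the box grows. This is the free-field analogue of the orthogonality described above as van Hove's phenomenon.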

One might say that by controlling the limiting state we have succeeded in 
constructing the theory. What then remains to be done is to verify the required 
properties, such as Poincaré invariance [for Lorentz covariance in Fock space 
cf. Cannon and Jaffe (1970) and for a 'Euclidean' proof Simon (1974)] and the 
desirable ones, like the existence of particles (Glimm and coworkers, 1974) and 
of non-trivial scattering processes between them (Osterwalder and Seneor, 
1975; Eckmann and Dimock, in press). 

We should emphasize that this construction actually predicts the particles of 
the theory: the resulting representation of the Poincaré group will not be 
equivalent to the original one in Fock space. In this sense, too, a relativistic 
quantum theory provides a more fundamental description of matter. The 
program that we have sketched for the construction of such theories includes 
many steps which we have barely mentioned here although they are technically 
very involved. It is nevertheless a tremendously important step forward in the 
construction of a relativistic quantum theory of matter that this program has 
been proven viable, if only for a sufficiently simplified class of models. 

It will turn out to be very instructive for us to track down the cause of such 
restrictions. Recall that we had cut off the interaction Hamiltonian in an effort 
to obtain a finite vector when applying it to the Fock vacuum: 


If we express, for example, an interaction energy density 

$$h_I(x) = g(x)\, :\varphi^n(x):$$

in terms of the creation and annihilation operators given in our 'dictionary' it is 
straightforward to calculate 

$$\|H_I\,\Omega_0\|^2 = \text{const.} \int \prod_{\nu=1}^{n} \frac{d^s k_\nu}{\omega(k_\nu)}\, \Bigl| \tilde g\Bigl( \sum_{\nu=1}^{n} k_\nu \Bigr) \Bigr|^2$$

Here $H_I$ is the space integral of $h_I$, $\Omega_0$ is the Fock vacuum, and 
$\tilde g$ denotes the Fourier transform of the cutoff function $g$. Whatever its 
exact form may be, the integral is finite only in a model world where the 
k-integration, and hence space, is one-dimensional. With increasing space-time 
dimensionality (and increasing interaction power n) the integral exhibits a 
higher and higher degree of divergence for large k, i.e. an 'ultraviolet' 
divergence that calls for renormalizations. 

*For this 'reconstruction' of fields resp. bounded observables cf., for example, 
Jost (1965), Streater and Wightman (1964) and Emch (1972). 

We have found that in such cases the space cutoff Hamiltonian may not see 
the Fock vacuum; technically the latter is not in the domain of $H_l$. Nor, as one 
can check, are any other simple Fock space vectors that one might think of 
(Glimm, 1969). 
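
The dependence of this divergence on the space dimension can be seen in the simplest case n = 2. If the total momentum k_1 + k_2 is set to zero, the remaining integral is ∫ d^s k / ω(k)² = ∫ d^s k / (k² + m²). The following Python sketch evaluates it with a sharp momentum cutoff; restricting to n = 2 at zero total momentum, the sharp cutoff and all function names are simplifying assumptions made for illustration.

```python
from math import atan, log, pi

# Surface area of the unit sphere in s dimensions (s = 1, 2, 3).
SPHERE_AREA = {1: 2.0, 2: 2.0 * pi, 3: 4.0 * pi}


def cutoff_integral(s, cutoff, mass=1.0, steps=200000):
    """Simpson-rule value of the integral of d^s k / (k^2 + m^2) over |k| < cutoff."""
    h = cutoff / steps
    total = 0.0
    for j in range(steps + 1):
        k = j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * k ** (s - 1) / (k * k + mass * mass)
    return SPHERE_AREA[s] * total * h / 3.0


def closed_form(s, cutoff, mass=1.0):
    """Elementary antiderivatives of the same radial integral, for comparison."""
    if s == 1:
        return 2.0 * atan(cutoff / mass) / mass
    if s == 2:
        return pi * log(1.0 + (cutoff / mass) ** 2)
    if s == 3:
        return 4.0 * pi * (cutoff - mass * atan(cutoff / mass))
    raise ValueError("only s = 1, 2, 3 are tabulated in this sketch")


if __name__ == "__main__":
    for s in (1, 2, 3):
        print(s, [round(cutoff_integral(s, c), 3) for c in (10.0, 100.0, 1000.0)])
```

For s = 1 the value saturates at π/m as the cutoff is removed, for s = 2 it grows logarithmically and for s = 3 linearly in the cutoff, in line with the statement that the norm is finite only for one space dimension.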

For such singular perturbations the domains of the Hamiltonians (the vectors 
of finite energy) are sensitive to the detailed features of the interaction; its 
specific form would have to be taken into account in their construction. This 
particular problem can be attacked with the help of approximate Møller 
operators. In non-relativistic quantum mechanics these serve to intertwine 
between the free and the interacting Schrödinger Hamiltonians, and 
consequently, between their domains. A viable adaptation of these ideas to the 
case at hand proceeds along the following steps: 

(1). introduce a high momentum cutoff in the interaction Hamiltonian to make 
it well-defined; 
(2). use Friedrichs' perturbative construction [for a review and references cf. 
Streit (1970)] to obtain approximate wave operators ('dressing 
transformations'); 
(3). apply these 'dressing transformations' to suitable Fock space vectors to 
obtain state vectors that the interacting Hamiltonian can see; 
(4). remove cutoffs to obtain states appropriate for the full, no cutoff 
interaction. 

Technically the construction of such approximate dressing transformations 
and controlling the limits is extremely complicated, but two structurally 
interesting observations should be made before we embark on a more recent 
alternate approach. First, one can only hope to find intertwining 
transformations for operators with matching spectra. Friedrichs' construction 
actually generates these adjustments of the ground-state energy, mass gap, etc. 
These are the so-called renormalization counter-terms. In the limit as the cutoff 
is removed they would become infinite but as they cancel corresponding infinite 
ground-state energies, masses, etc., in the original Hamiltonian the overall 
renormalized energy operator has a finite limit. A particularly accessible 
subclass of models is formed by those where such asymptotically infinite 
renormalization terms occur only up to a finite perturbation theoretical order. 
These are the so-called superrenormalizable models, among them the 'Yukawa 
interaction' of fermions and bosons in two space-time dimensions ($Y_2$), and 
the quartic self-interactions of scalar mesons in a three-dimensional space-time 
($\varphi^4_3$). At present the problem of going beyond this class and of 
tackling models like $\varphi^4_4$ in the physical four-dimensional space-time 
is still unsolved. 

Secondly, two cases must be distinguished regarding the limiting 'dressed 
states' as the momentum cutoff is removed. In the less singular case the limits 
can be performed within Fock space. Otherwise one must proceed as with the 
infinite volume limit that we have discussed above and construct a new field 
representation from limits of expectation values. In this latter case then, the 
ultraviolet divergences alone already call for a non-Fock representation of the 
field. Prominent examples are the $Y_2$ and $\varphi^4_3$ models, respectively 
[for a review cf. Hepp (1969)]. 


It has been observed frequently in various contexts of non-relativistic as well as 
relativistic quantum dynamics that the transition to imaginary time results in 
remarkable structural simplifications: one obtains the heat equation from the 
Schrodinger equation, correspondingly Wiener integrals instead of Feynman's 
path integral, better behaved kernels in the Bethe-Salpeter equation for 
relativistic two-particle amplitudes, and most importantly for us here, the 
transition from relativistic to 'Euclidean' quantum field theory brought about 
by switching from relativistic Minkowski space-time to a space-(imaginary) 
time with positive definite Euclidean metric gives us models of equilibrium 
statistical mechanics (Symanzik, 1969) which we are comparatively much 
better equipped to handle. 
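
For the first item of this list the substitution can be written out in one line. With units $\hbar = m = 1$ (an assumption made here for brevity), a potential $V$ and $t = -i\tau$:

```latex
\begin{align*}
i\,\frac{\partial \psi}{\partial t} &= -\tfrac12\,\Delta\psi + V\psi,
\qquad t = -i\tau \;\Rightarrow\; \frac{\partial}{\partial t} = i\,\frac{\partial}{\partial \tau}, \\
\frac{\partial \psi}{\partial \tau} &= \tfrac12\,\Delta\psi - V\psi .
\end{align*}
```

For $V = 0$ this is the heat equation; the oscillating factor $e^{-iEt}$ of a stationary state turns into the decaying factor $e^{-E\tau}$.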

The central role that this latter transformation has recently begun to play in 
the development of relativistic quantum dynamics stems from the fact that 
Nelson (Simon, 1974, Chap. IV) and K. Osterwalder and R. Schrader (Simon, 
1974, Chap. II) have given conditions under which it becomes reversible. 

In the light of this discovery it has become an advantageous, and very 
effective, approach to the construction of quantum field theories to first 
establish the corresponding Euclidean theories and as many of their properties 
as possible by means of methods borrowed from statistical mechanics, and 
finally to check that they survive the transition back to the relativistic 
Minkowski space-time. As a recent example, among many others, we mention 
the work on the existence of a non-trivial scattering matrix and its asymptotic 
series expansion (cf. the papers of Eckmann and Dimock, in press). For 
in-depth reading on the 'Euclidean strategy' a monograph written by one of the 
leading experts in this field is available (Simon, 1974). In the present review we 
want to give an introductory sketch of the method and of its scope. To this end 
we need to introduce one more concept from quantum field theory, the 
time-ordered Green's functions. They are symmetric functions of n space-time 
arguments defined to equal the vacuum expectation value of the n-field 
time-ordered product of the field at space-time points 
$x_i = (\mathbf{x}_i, x_{0i})$, 

$$\tau_n(x_1, \ldots, x_n) = (\Omega,\, \varphi(x_1) \cdots \varphi(x_n)\,\Omega) \quad \text{if } x_{01} > x_{02} > x_{03} > \ldots > x_{0n}$$

and they are described most handily by their generating functional 

$$\tau[f] = \sum_n \frac{i^n}{n!} \int \prod_{\nu=1}^{n} \bigl( d^{s+1}x_\nu\, f(x_\nu) \bigr)\, \tau_n(x_1, \ldots, x_n) = \Bigl( \Omega,\; T\, e^{i \int \varphi(x) f(x)\, d^{s+1}x}\; \Omega \Bigr)$$

It is straightforward but very instructive to calculate this functional for the 
trivial case where $\varphi$ is a free field of mass $m > 0$ in Fock space so that 
it obeys the Klein-Gordon equation of motion 

$$(\partial_\mu \partial^\mu + m^2)\,\varphi(x) = 0$$

Defining its Green's function by 

$$\Delta_F(x) = \frac{1}{(2\pi)^{s+1}} \int d^{s+1}p\; \frac{e^{-ipx}}{p_\mu p^\mu - m^2 + i\varepsilon}$$

one finds for the free-field functional $\tau_0$ 

$$\tau[f] = \tau_0[f] = \exp\Bigl( -\frac{i}{2} \int dx\, dy\, f(x)\, \Delta_F(x - y)\, f(y) \Bigr)$$

Continuation to imaginary times yields the functional $\sigma[f]$. Minkowski 
inner products become Euclidean ones 

$$p_\mu p^\mu \to -p^2$$

so that, in terms of the Fourier transform $\tilde f$ 

$$\sigma[f] = \exp\Bigl( -\tfrac12 \bigl( \tilde f,\, (p^2 + m^2)^{-1} \tilde f \bigr) \Bigr)$$

$\sigma$ is the generating functional of the $\tau$-functions continued to 
imaginary time, the so-called Schwinger functions $S_n$. Their interest lies in 
the fact that fairly explicit and very useful expressions for $\sigma$ and the 
Schwinger functions can also be derived for an interacting field. It will be our 
main task in this section to do so in a heuristic fashion. The necessary 
mathematical arguments are presented in Simon (1974), Chapter V; as examples 
of recent extensions to more singular models such as the Yukawa model $Y_2$ or 
the quartic meson self-interaction $\varphi^4_3$ in three-dimensional 
space-time we quote McBryan (in press). 

The crucial observation is that, in contradistinction to $\tau[f]$, $\sigma[f]$ is 
the exponential of a negative definite quadratic form, i.e. just like the 
generating functional of a free field at fixed time we may write it as the Fourier 
transform of a (Gaussian) probability measure on the space of generalized 
functions: 

$$\sigma[f] = \int_{\mathcal{S}'(\mathbb{R}^{s+1})} e^{i(\chi, f)}\, d\mu_0(\chi)$$

Recall that, for finite dimensional vector spaces (!), the Fourier transform of a 
Gaussian is again a Gaussian, with the inverse quadratic form in the exponent, 
i.e. formally 

$$d\mu_0(\chi) = \text{const}\; e^{-\frac12 (\tilde\chi,\, (p^2 + m^2)\tilde\chi)}\, d^\infty\chi = \text{const}\; e^{-\frac12 \int d^{s+1}x\, (\dot\chi^2 + (\nabla\chi)^2 + m^2\chi^2)}\, d^\infty\chi$$


Observe the space-time integral of the Hamiltonian density in the exponent; an 
extra factor 

$$\mathcal{N}_g\, e^{-\int dx\, g\, h_I(x)}$$

ought to generate the measure $d\mu_g$ appropriate to the interaction mediated 
by the Hamiltonian 

$$H = H_0 + \int d^s x\, g\, h_I(x)$$

Now evidently the infinite volume element $d^\infty\chi$ above is purely formal 
but if we add the interaction factor to the left-hand side there is at least a 
fighting chance for 

$$d\mu_g = \mathcal{N}_g\, e^{-\int dx\, g\, h_I(x)}\, d\mu_0$$

to be well-defined since $d\mu_0$ is. $\mathcal{N}_g$ is just a normalization 
factor for the new measure. 




Not unexpectedly there is a Euclidean variant of Haag's theorem in our way but 
by now we know how to deal with this: we cut off the interaction strength by 
making $g$ space-time dependent and let $g \to \text{const}$ in the final 
expression. For our favourite model $\varphi^4_2$ where 

$$h_I(x) = g(x)\, :\chi^4(x):$$

this leads to the probability measure 

$$d\mu[\chi] = \lim_{g \to \text{const}} \frac{e^{-\int d^2x\, h_I(x)}\, d\mu_0[\chi]}{\int e^{-\int d^2x\, h_I(x)}\, d\mu_0[\chi]}$$

with the Schwinger functions as its moments 

$$S_n(x_1, \ldots, x_n) = \int \chi(x_1) \cdots \chi(x_n)\, d\mu[\chi]$$

For these quantities then one must verify the 'Osterwalder-Schrader axioms' 
(Simon, 1974, Chap. II) which are Euclidean analogues to those of Wightman 
and which guarantee that a corresponding relativistic quantum field theory 
obeying the latter exists. 
The shorthand notation 

$$\int \cdot\; d\mu_0 = \langle \cdot \rangle_0, \qquad \int d^2x\, h_I(x) = U_g$$

makes the similarity with infinite volume Gibbs states of equilibrium statistical 
mechanics even more transparent: 

$$\sigma[f] = \lim_{g \to \text{const}} \frac{\langle e^{i\chi(f)}\, e^{-U_g} \rangle_0}{\langle e^{-U_g} \rangle_0}$$


The following random collection of observations is meant to serve as an 
illustration — by no means an exhaustive one — of the wealth of information 
which this analogy opens up. 

(1). The coupling constant g plays the role of an inverse temperature. High 
temperature expansions as in statistical mechanics have turned out to be 
useful to deal with the weak coupling regime of model quantum field 
theories (Simon, 1974). 

(2). Physical masses, i.e. the lowest excitations of the system, can be 
discussed effectively in terms of inverse correlation lengths. 

(3). The direct coupling of the random field at different space-time points is 
brought about by the gradient term of the free Hamiltonian. In a lattice 
approximation where the random field is replaced by a discrete set of 
'spin' variables $\chi_i$ this coupling amounts to that of a nearest neighbour 
Ising ferromagnet. As a result various useful correlation inequalities can 
be proven for the Schwinger functions (Simon, 1974). 

(4). As the coupling strength is increased $\varphi^4_2$ models exhibit phase 
transitions, long-range order, and the breaking of the 
$\varphi \to -\varphi$ symmetry (Glimm and coworkers, 1975; cf. also Glimm 
and Jaffe, in press). The proof uses mean field techniques and the Peierls 
argument from statistical mechanics. Here it becomes patent to what extent 
the Euclidean formulation has come into its own. 

(5). The existence of a non-trivial $\varphi^4_4$ model has turned out to be 
closely related to the non-triviality of the four-dimensional Ising model at 
the critical point (Glimm and Jaffe, 1974; Schrader, 1975). 
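
Observations (2) and (3) can be illustrated for the free field on a one-dimensional periodic lattice with unit spacing, where the Gaussian measure has the nearest neighbour form exp(-½ Σ_i [(χ_{i+1} - χ_i)² + m² χ_i²]). The lattice, its size and the function names are assumptions chosen for this sketch; it computes the two-point function by a discrete Fourier sum and reads off the correlation length from its decay.

```python
from math import acosh, cos, cosh, log, pi, sinh


def lattice_covariance(r, n_sites, mass):
    """Two-point function <chi_0 chi_r> of the Gaussian lattice measure.

    The quadratic form is -Laplacian + mass^2 on a periodic chain; it is
    diagonal in Fourier space with eigenvalues mass^2 + 2 - 2 cos(p).
    """
    total = 0.0
    for j in range(n_sites):
        p = 2.0 * pi * j / n_sites
        total += cos(p * r) / (mass ** 2 + 2.0 - 2.0 * cos(p))
    return total / n_sites


def lattice_mass(mass):
    """Exact decay rate mu on the infinite chain: cosh(mu) = 1 + mass^2 / 2."""
    return acosh(1.0 + 0.5 * mass ** 2)


def periodic_closed_form(r, n_sites, mass):
    """Sum over periodic images of exp(-mu |r|) / (2 sinh mu), for 0 <= r <= n_sites."""
    mu = lattice_mass(mass)
    return cosh(mu * (0.5 * n_sites - r)) / (2.0 * sinh(mu) * sinh(0.5 * mu * n_sites))


def correlation_length(n_sites, mass, r=1):
    """Decay length extracted from the ratio of neighbouring correlations."""
    ratio = lattice_covariance(r + 1, n_sites, mass) / lattice_covariance(r, n_sites, mass)
    return -1.0 / log(ratio)


if __name__ == "__main__":
    for m in (0.25, 0.5, 1.0):
        print(m, correlation_length(64, m), 1.0 / lattice_mass(m))
```

For small mass the decay rate μ approaches m, so the mass of the free lattice field is the inverse of its correlation length. The cross term -χ_i χ_{i+1} in the exponent is the ferromagnetic nearest neighbour coupling referred to in (3).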

With this glimpse of the final goal— namely to establish non-trivial relativistic 
quantum theories for interacting particles in four-dimensional space-time — we 
close this 'introductory review'. We have tried to display a representative 
subset of the techniques and the trends of a field that has recently seen rapid 
development. At this point there is good reason to be optimistic about the 
emergence of relevant models for the subnuclear structure and interaction of 
matter. With this goal in mind the impressive amount not just of abstract 
existence proofs, but beyond these of structural insight and of sound computa- 
tional techniques inherent in the recent development of constructive quantum 
field theory acquires a particular relevance. 


Araki, H. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York. 
Bjorken, J. D. and Drell, S. D. (1965) Relativistic Quantum Fields, McGraw-Hill, New York. 
Cannon, J. T. and Jaffe, A. (1970) Comm. Math. Phys., 17, 261. 
Dashen, R., Hasslacher, B. and Neveu, A. (1974) Phys. Rev., D10, 4138. 
Dimock, J. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna. 
Driessler, W. (in press) 'On the structure of fields and algebras on null-planes I, II', Acta Phys. 
Eckmann, J. P. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna. 
Emch, G. (1972) Algebraic Methods in Statistical Mechanics and Quantum Field Theory, John Wiley, New York. 
Gelfand, I. M. and Vilenkin, N. Ya. (1964) Generalized Functions, Vol. 4, Chap. IV, Academic Press, New York. 
Glimm, J. and Jaffe, A. (1970) Ann. Math., 91, 362. 
Glimm, J. and Jaffe, A. (1974) Phys. Rev. Lett., 33, 440. 
Glimm, J. and Jaffe, A. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna. 
Glimm, J., Jaffe, A. and Spencer, T. (1974) Ann. Math., 100, 583. 
Glimm, J., Jaffe, A. and Spencer, T. (1975) Comm. Math. Phys., 45, 203. 
Guerra, F. (1972) Phys. Rev. Lett., 28, 1213. 
Haag, R. (1955) Dan. Mat. Fys. Medd., 29, no. 12. 
Heisenberg, W. (1939) Z. Physik, 113, 61. 
Heisenberg, W. (1960) Sprache und Wirklichkeit in der modernen Physik in Gestalt und Gedanke, Folge 6. 
Heisenberg, W. (1969) Der Teil und das Ganze, Chaps. 5-11, 17, Piper, Munich. 
Hepp, K. (1969) Théorie de la Renormalisation, Springer, Berlin. 
Hida, T. (1970) Stationary Stochastic Processes, Section 4, Princeton University Press, Princeton. 
Jaffe, A. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York. 
Jost, R. (1965) The General Theory of Quantized Fields, Amer. Math. Soc., Providence. 
Leutwyler, H., Klauder, J. R. and Streit, L. (1970) Nuovo Cimento, 66A, 536. 
Mackey, G. W. (1963) Mathematical Foundations of Quantum Mechanics, Benjamin, New York. 
McBryan, D. A. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna. 
Osterwalder, K. and Seneor, R. (1975) 'The scattering matrix is non-trivial for weakly coupled $P(\varphi)_2$ models'. Preprint. 
Putnam, C. R. (1967) Commutation Properties of Hilbert Space Operators and Related Topics, Springer, Berlin. 
Reed, M. (1976) Abstract Non-linear Wave Equations, Springer, Berlin. 
Schrader, R. (1975) 'A possible constructive approach to $\varphi^4_4$ I, II'. Berlin preprints. 
Schweber, S. S. (1961) An Introduction to Relativistic Quantum Field Theory, Harper and Row, New York. 
Simon, B. (1974) The $P(\varphi)_2$ Euclidean (Quantum) Field Theory, Princeton University Press, Princeton. 
Streater, R. F. and Wightman, A. S. (1964) PCT, Spin and Statistics, and All That, Benjamin, New York. 
Streit, L. (1970) Acta Phys. Austriaca Suppl. VII, 355. 
Symanzik, K. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York. 

Classical Electromagnetic and Gravitational Field Theories
as Limits of Massive Quantum Theories

GORDON FELDMAN

The Johns Hopkins University, Baltimore, Maryland, U.S.A.


The correspondence principle in quantum mechanics states, inter alia, that as
Planck's constant $\hbar$ approaches zero the theory must approach the
corresponding classical theory. This principle is meaningful only if there
exists a classical theory which corresponds to the particular quantum theory.
If we examine quantum field theories we can apply the principle only to
'massless' theories, i.e. to field theories which on quantization describe
particles of zero mass. One can see this in many ways. The simplest is to
notice that a field equation involves derivatives of the field to which one
must add a 'mass term'. Dimensional arguments require that this term be
proportional to powers of $mc/\hbar$ (the inverse Compton wavelength, $m$ being
the mass). We see that taking the limit $\hbar \to 0$ with $m$ kept fixed is
completely different from the limit $m \to 0$ and then $\hbar \to 0$.
Accordingly, a classical field theory of particles requires taking the
$m \to 0$ limit before the $\hbar \to 0$ limit, i.e. a classical field theory
of particles, of necessity, describes massless particles. In fact as
$m \to 0$ the parameter $\hbar$ disappears from the field equations.
Two familiar examples of classical field theories are the electromagnetic
(Maxwell theory) and gravitational (Einstein theory) field theories. We can
interpret the Maxwell field (or photon field) as a relativistic field
describing particles of mass zero and spin one. The Einstein (or
gravitational) field is a relativistic field describing particles of zero mass
and spin two. One might expect that these classical field theories may be the
limit of quantum field theories which describe massive particles of spin one
and two. This problem has attracted some attention recently (Boulware and
Deser, 1972; van Dam and Veltman, 1970). One examines quantum field theories
describing massive particles of spin one and two coupled to sources and then
performs the $m \to 0$ limit. The question is whether this limit is smooth,
i.e. whether the limiting theory gives rise to the same experimental
consequences as the corresponding field theories describing massless particles
of spin one and two coupled to the same sources, respectively. That there may
be some problems connected with the $m \to 0$ limit we can see from the
properties of the representations of the Poincaré group. Those irreducible
representations which span the space of massive particle states also have a
spin parameter, $s$, with degeneracy $2s + 1$, i.e. a particle of mass $m$ and
spin $s$ has $2s + 1$ degrees of freedom. However, the irreducible
representation corresponding to a massless particle also has a spin parameter
$s$ but only two degrees of freedom*. The implication of the above remarks is
that the Hilbert space of physical states describing particles of mass
$m \neq 0$ and $s \geq 1$ is somehow larger than the Hilbert space for the
corresponding massless particles. Since degrees of freedom cannot disappear,
the resolution to the problem must be in the fact that either the $m \to 0$
limits are not smooth (i.e. the two theories are different) or that the
'disappearing' degrees of freedom decouple, or both. In this article we
examine carefully the $m \to 0$ limits for spin one and spin two field
theories to see what happens to the structure of the theories.
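The counting in the paragraph above can be made concrete with a minimal sketch. The function name `degrees_of_freedom` is ours, not the author's; it only encodes the rule quoted in the text ($2s + 1$ states for $m \neq 0$, two states for $m = 0$ and $s \geq 1$, one state for $s = 0$), assuming integer spin and parity included as in the footnote.

```python
def degrees_of_freedom(spin: int, massive: bool) -> int:
    """Independent states of a particle of integer spin (parity included)."""
    if spin < 0:
        raise ValueError("spin must be non-negative")
    if massive or spin == 0:
        # massive: 2s + 1 states; spin zero has a single state in either case
        return 2 * spin + 1
    # massless with s >= 1: only the two helicity states survive
    return 2


for s in (1, 2):
    massive = degrees_of_freedom(s, True)
    massless = degrees_of_freedom(s, False)
    print(f"spin {s}: massive {massive}, massless {massless}, "
          f"'disappearing' {massive - massless}")
```

For spin one a single degree of freedom has to be accounted for in the limit, for spin two there are three, which is why the two cases can behave differently.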

Most of the results have been obtained previously by Boulware and Deser 
(1972) and by van Dam and Veltman (1970). What we do in this article is to 
approach the problem by using different techniques. In Section 2 we examine 
the equations of motion for spin one and two fields in order to see how the 
$m \neq 0$ and $m = 0$ equations each describe the correct number of degrees of
freedom. In Section 3 we solve the equations in the presence of sources by 
finding the propagators. We again compare and contrast the solutions for the 
massive and massless cases in order to see if and why the massive solution 
approaches the massless case. In Section 4, we find those Lagrangians which 
lead to the required equations of motion. We also make use of the Lagrangian 
to find the commutation relations for the independent degrees of freedom, in 
order to see again if the 'disappearing' degrees of freedom do or do not have 
smooth limits. In the Appendices we outline some of the projection operator 
techniques used in the paper. 


In this section we discuss the equations of motion of massive and massless
fields of spins one and two in the presence of external sources. We will
demand ultimately that these equations be derivable from a Lagrangian.
Therefore if the field for spin one is a vector field $A_\mu$, its source
$j_\mu$ is also a vector, and if the field for spin two is a symmetric tensor
field $h_{\mu\nu}$, its source $T_{\mu\nu}$ is also a symmetric tensor. We
discuss two problems in this section: (a) how the equations of motion for a
vector field with four components describe only three dynamical variables for
mass $m \neq 0$ and two dynamical variables for $m = 0$ and (b) how the
equations of motion for a symmetric tensor field with ten components describe
only five dynamical variables for $m \neq 0$ and two dynamical variables for
$m = 0$.

*In addition to the operations of the Poincaré group we include the spatial
inversion (or parity) operation. Of course for spin $s = 0$ there is only one
degree of freedom.

Feldman 367 

We can write down the equations for both spins uniformly by making use of
the Levi-Civita tensor density $\epsilon^{\mu\nu\lambda\rho}$ with the usual
antisymmetry properties*. Define the second order differential operator

$$D^{\beta\sigma\tau}{}_{\alpha\rho\lambda} = \epsilon_{\mu\alpha\rho\lambda}\,\epsilon^{\nu\beta\sigma\tau}\,\partial^\mu\partial_\nu \tag{1}$$






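We can write the equations for spin one and two by operating with $D$ on either
$A_\mu$ or $h_{\mu\nu}$ and saturating enough indices so that the resulting
tensor transforms like $A_\mu$ or $h_{\mu\nu}$ respectively. Thus for spin one
we form

$$D^{\alpha\rho\lambda}{}_{\beta\rho\lambda}\,A^\beta$$

and for spin two

$$D^{\alpha\sigma\lambda}{}_{\beta\rho\lambda}\,h^\rho{}_\sigma$$

The mass term will be proportional to $A_\mu$ and $h_{\mu\nu}$ respectively.
Accordingly for spin one we have

$$\tfrac{1}{2}D^{\alpha\rho\lambda}{}_{\beta\rho\lambda}A^\beta(x) - m^2A^\alpha(x) = j^\alpha(x) \tag{5}$$

and for spin two†

$$D^{\alpha\sigma\lambda}{}_{\beta\rho\lambda}h^\rho{}_\sigma(x) + m^2\left(h^\alpha{}_\beta(x) - a\,\delta^\alpha_\beta\,h(x)\right) = T^\alpha{}_\beta(x) \tag{6a}$$

which for $a = 1$ reads

$$D^{\alpha\sigma\lambda}{}_{\beta\rho\lambda}h^\rho{}_\sigma(x) + m^2\left(h^\alpha{}_\beta(x) - \delta^\alpha_\beta\,h(x)\right) = T^\alpha{}_\beta(x) \tag{6}$$

where

$$h = h^\mu{}_\mu \tag{7}$$

and $a$ is, at present, arbitrary.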


*We shall use units such that $\hbar = c = 1$. The metric $\eta_{\mu\nu}$ has
only the diagonal elements $(1, -1, -1, -1)$; $\mu, \nu = 0, 1, 2, 3$;
$i, j = 1, 2, 3$ and $\epsilon^{0123} = +1$.

†Note that we can write the equation for a spin zero field $\phi(x)$ as

$$\tfrac{1}{6}D^{\rho\sigma\lambda}{}_{\rho\sigma\lambda}\,\phi - m^2\phi = j$$


vatpA _ f>v 


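If we make use of the following identities

$$\epsilon_{\mu\beta\rho\lambda}\,\epsilon^{\nu\alpha\sigma\lambda} = -\left(\delta^\nu_\mu\delta^\alpha_\beta\delta^\sigma_\rho + \delta^\nu_\beta\delta^\alpha_\rho\delta^\sigma_\mu + \delta^\nu_\rho\delta^\alpha_\mu\delta^\sigma_\beta - \delta^\nu_\mu\delta^\alpha_\rho\delta^\sigma_\beta - \delta^\nu_\beta\delta^\alpha_\mu\delta^\sigma_\rho - \delta^\nu_\rho\delta^\alpha_\beta\delta^\sigma_\mu\right) \tag{8}$$

$$\tfrac{1}{2}\,\epsilon_{\mu\beta\rho\lambda}\,\epsilon^{\nu\alpha\rho\lambda} = -\left(\delta^\nu_\mu\delta^\alpha_\beta - \delta^\nu_\beta\delta^\alpha_\mu\right) \tag{9}$$

we see that equation (5) for $m = 0$ is just Maxwell's equation and equation
(6) is the Pauli-Fierz equation (Fierz and Pauli, 1939) for massive spin two.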
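Contraction identities of this kind are easy to get wrong by a sign, so a numerical check is a useful companion. The sketch below is ours, not part of the original text. It assumes the conventions of the footnote (metric $(1,-1,-1,-1)$, $\epsilon^{0123} = +1$) and tests the once- and twice-contracted products of two Levi-Civita tensors against the determinant-type combinations of Kronecker deltas, with the contracted index placed last in both tensors. The index names `m, b, r, n, a, s, l` are illustrative labels only.

```python
import itertools
import numpy as np


def levi_civita_4d():
    """Totally antisymmetric symbol with eps[0, 1, 2, 3] = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(4) for j in range(i + 1, 4))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


eta = np.diag([1.0, -1.0, -1.0, -1.0])
eps_up = levi_civita_4d()                       # eps^{mu nu rho lambda}
# lower all four indices with the metric: eps_{0123} = -1
eps_dn = np.einsum('ma,nb,rc,ld,abcd->mnrl', eta, eta, eta, eta, eps_up)
d = np.eye(4)                                   # Kronecker delta, d[n, m] = delta^n_m

# two indices contracted: (1/2) eps_{m b r l} eps^{n a r l}
lhs2 = 0.5 * np.einsum('mbrl,narl->mbna', eps_dn, eps_up)
rhs2 = -(np.einsum('nm,ab->mbna', d, d) - np.einsum('nb,am->mbna', d, d))


def ddd(t1, t2, t3):
    """Product of three deltas arranged on the output indices m b r n a s."""
    return np.einsum(f'{t1},{t2},{t3}->mbrnas', d, d, d)


# one index contracted: eps_{m b r l} eps^{n a s l}
lhs1 = np.einsum('mbrl,nasl->mbrnas', eps_dn, eps_up)
rhs1 = -(ddd('nm', 'ab', 'sr') + ddd('nb', 'ar', 'sm') + ddd('nr', 'am', 'sb')
         - ddd('nm', 'ar', 'sb') - ddd('nb', 'am', 'sr') - ddd('nr', 'ab', 'sm'))

print(np.allclose(lhs2, rhs2), np.allclose(lhs1, rhs1))
```

The overall minus sign comes from lowering the indices with a metric of determinant $-1$; with a Euclidean metric both right-hand sides would change sign.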

Now we must see how many dynamical variables appear in equations (5), (6)
and (6a). Any component of $A_\mu$ or $h_{\mu\nu}$, say $\phi$, and its time
derivative $\partial_0\phi$ can be assigned arbitrarily on some constant time
surface, say $t = 0$. The dynamical variables are those components of $A_\mu$
and $h_{\mu\nu}$ which appear in the equations with second time derivatives.
If in equation (5) we set the index $\alpha = 0$, using (1) we find
$\nu \neq 0$ and thus $A_0$ appears in the equations only as a zeroth or first
time derivative. We have then that only the $A_i$ are dynamical variables and
that the equations of motion determine $A_0$ in terms of the $A_i$. We
apparently have not used the fact that $m \neq 0$. However if $m = 0$,
equation (5) is invariant under a set of transformations, the gauge
transformations*,

$$A_\mu \to A_\mu + \partial_\mu\Lambda \tag{10}$$

One may see this trivially by using (1) and the antisymmetry property of
$\epsilon$. We can choose $\Lambda$ (i.e. find a gauge) so that one of the
apparent dynamical variables $A_i$ is identically zero. We are left with two
dynamical variables for $m = 0$.

Another property that follows immediately from (5) for $m = 0$ is current
conservation. That is, if we take $\partial_\alpha$ of equation (5) we must
have, for $m = 0$,

$$\partial_\alpha j^\alpha = 0 \tag{11}$$

This is a consequence of the equations of motion; it is not a separate
equation of motion. Again, it is proved trivially using the antisymmetry of
$\epsilon$. For the case $m \neq 0$, we may be able to choose sources such that
(11) follows, but in this case (11) will be an additional equation of motion.

We can now carry out the same procedure for the field $h_{\mu\nu}$. In
equation (6a) if we set the index $\alpha = 0$, then using (1), $\nu \neq 0$
and therefore the four $h^0{}_\beta$ cannot be dynamical variables. We are left
with the six $h_{ij}$ as possible dynamical variables. Consider, now, equation
(6a) with $\alpha = \beta = 0$. It reads

$$F(\partial_i\partial_j h^l{}_k) + m^2\left[(1 - a)\,h^0{}_0 - a\,h^i{}_i\right] = T^0{}_0 \tag{12}$$

The derivative terms $F$ are only spatial derivatives of the $h_{ij}$.
Accordingly, if we choose

$$a = 1 \tag{13}$$

*We assume of course that $j_\mu$ is gauge invariant.


equation (12) is a constraint equation on the six $h_{ij}$ and we are left with
five dynamical variables, the number required to describe a massive spin two
field. The resulting equation, (6), is indeed the Pauli-Fierz equation (Fierz
and Pauli, 1939).

Let us now turn to the case $m = 0$ for (6). Of course (12) with $m = 0$ again
reduces the six $h_{ij}$ to five independent variables. Again, as for spin
one, equation (6) for $m = 0$ is invariant under a set of transformations, the
gauge transformations

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\Lambda_\nu + \partial_\nu\Lambda_\mu \tag{14}$$

which one deduces trivially from the properties of $D$. We can choose
$\Lambda_\mu$ such that three of the remaining five $h_{ij}$ are identically
zero. This leaves us with two dynamical variables for $m = 0$, as required.
Again it follows from the equations of motion that

$$\partial_\alpha T^\alpha{}_\beta = 0 \tag{15}$$

if $m = 0$. This leads to the well-known problem of the consistency of these
equations, if we identify $T^\alpha{}_\beta$ with the energy momentum tensor of
matter and radiation and $h^\alpha{}_\beta$ with the gravitational field. This
$T^\alpha{}_\beta$ cannot be conserved and we must add to it the energy
momentum of the gravitational field itself which then leads to equations
non-linear in the $h_{\mu\nu}$. We can use this technique to lead us to the
full Einstein equations for the gravitational field. See Deser (1970) for
references. In this work we shall restrict ourselves to the linearized
version. In doing so we are in effect assuming that $T^\alpha{}_\beta$ is
proportional to some small coupling constant $f$ and that we work only to
$O(f^2)$, in which case the matter and radiation energy momentum tensor will
be conserved.


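In this section we shall obtain the propagators for the classical fields by
using the projection operator techniques outlined in Appendix A. It is
convenient to take the Fourier transform of the equations of motion (5) and
(6) and so work in momentum space. After this transformation and making use of
the identities (8) and (9) we can write the equations of motion for
$A_\mu(p)$ and $h_{\mu\nu}(p)$, the Fourier transformed fields, as follows:

$$K_\mu{}^\nu A_\nu(p) = j_\mu(p) \tag{16}$$

$$K_{\mu\nu}{}^{\alpha\beta}\,h_{\alpha\beta}(p) = -T_{\mu\nu}(p) \tag{17}$$

where

$$K_\mu{}^\nu = (p^2 - m^2)\,\delta_\mu^\nu - p_\mu p^\nu \tag{18}$$

$$K_{\mu\nu}{}^{\alpha\beta} = \tfrac{1}{2}\left[(p^2 - m^2)(\delta_\mu^\alpha\delta_\nu^\beta + \delta_\mu^\beta\delta_\nu^\alpha) - (\delta_\mu^\alpha p_\nu p^\beta + \delta_\nu^\alpha p_\mu p^\beta + \delta_\mu^\beta p_\nu p^\alpha + \delta_\nu^\beta p_\mu p^\alpha)\right] + \eta_{\mu\nu}\,p^\alpha p^\beta + p_\mu p_\nu\,\eta^{\alpha\beta} - (p^2 - m^2)\,\eta_{\mu\nu}\eta^{\alpha\beta} \tag{19}$$

where we have assumed $h_{\mu\nu}$ and $T_{\mu\nu}$ are symmetric tensors. If
the tensors $K$ have an inverse we can solve equations (16) and (17) for
$A_\mu$ and $h_{\mu\nu}$ respectively, to read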

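$$A_\mu = G_\mu{}^\nu\,j_\nu \tag{20}$$

$$h_{\mu\nu} = -G_{\mu\nu}{}^{\alpha\beta}\,T_{\alpha\beta} \tag{21}$$

where in both cases

$$G = K^{-1} \tag{22}$$

and, for spin two, the unit operator is

$$1_{\mu\nu}{}^{\alpha\beta} = \tfrac{1}{2}\left(\delta_\mu^\alpha\delta_\nu^\beta + \delta_\mu^\beta\delta_\nu^\alpha\right)$$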

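Since $A_\mu$ is a four-vector field we can find $G_\mu{}^\nu$ by writing
$K_\mu{}^\nu$ in terms of its spin one and spin zero projection operators.
These are easily found to be

$$P^{(1)}{}_\mu{}^\nu = \delta_\mu^\nu - \hat p_\mu\hat p^\nu, \qquad P^{(0)}{}_\mu{}^\nu = \hat p_\mu\hat p^\nu$$

where

$$\hat p_\mu = p_\mu/(p^2)^{1/2}$$

and we have

$$(P^{(1)})^2 = P^{(1)}, \qquad P^{(1)}P^{(0)} = 0$$

We can write

$$K_\mu{}^\nu = \left[(p^2 - m^2)\,P^{(1)} - m^2\,P^{(0)}\right]_\mu{}^\nu$$

The inverse of $K$ follows immediately giving

$$G_\mu{}^\nu = \left[\frac{1}{p^2 - m^2}\,P^{(1)} - \frac{1}{m^2}\,P^{(0)}\right]_\mu{}^\nu = \frac{1}{p^2 - m^2}\left(\delta_\mu^\nu - \frac{p_\mu p^\nu}{m^2}\right)$$

Again this shows that we are discussing a massive theory of spin one since
only the spin one components have a pole in $p^2$ and therefore propagate in
time, i.e. they are dynamical variables.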
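The inversion described above can be checked numerically. The following is a minimal sketch under our own conventions, not the author's: mixed tensors $M_\mu{}^\nu$ are stored as $4 \times 4$ arrays with the lower index as row and the upper index as column, so that operator products are ordinary matrix products. The momentum and mass values are arbitrary test numbers with $p^2 \neq 0$ and $p^2 \neq m^2$.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])


def spin_one_operators(p_up, m):
    """Return p^2, P0, P1, K, G for momentum p^mu and mass m != 0."""
    p_up = np.asarray(p_up, dtype=float)
    p_dn = eta @ p_up                      # p_mu
    p2 = p_up @ p_dn                       # p^2 = p^mu p_mu
    P0 = np.outer(p_dn, p_up) / p2         # spin zero: p_mu p^nu / p^2
    P1 = np.eye(4) - P0                    # spin one
    K = (p2 - m**2) * P1 - m**2 * P0
    G = P1 / (p2 - m**2) - P0 / m**2       # candidate inverse of K
    return p2, P0, P1, K, G


p_up, m = [2.0, 0.3, -0.5, 1.1], 0.7
p2, P0, P1, K, G = spin_one_operators(p_up, m)
p_dn = eta @ np.asarray(p_up)

# closed form of the massive propagator
closed_form = (np.eye(4) - np.outer(p_dn, p_up) / m**2) / (p2 - m**2)

# at m = 0 the operator is p^2 P1, which is singular (rank 3, not 4)
K_massless = p2 * P1

print(np.allclose(K @ G, np.eye(4)), np.allclose(G, closed_form),
      np.linalg.matrix_rank(K_massless))
```

The rank deficiency of the massless operator is the numerical face of the gauge invariance discussed next: one direction (the one along $p_\mu$) is annihilated.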

We see also that the limit $m \to 0$ is not straightforward. In fact for
$m = 0$ the operator $K_\mu{}^\nu$ does not have an inverse since it is
proportional to a projection operator. This is precisely the manifestation of
the gauge invariance of the massless theory. To solve our equations (16) for
$m = 0$, one normally 'goes into' some particular gauge, i.e. we modify the
equations of motion so as to introduce an operator $K_\mu{}^\nu(\lambda)$
which does have an inverse. The simplest set of gauges are the covariant
gauges which depend on a parameter $\lambda$. We define

$$K_\mu{}^\nu(\lambda) = \left(p^2\,P^{(1)} + \frac{p^2}{\lambda}\,P^{(0)}\right)_\mu{}^\nu$$

which does have an inverse, which is

$$G_\mu{}^\nu(\lambda) = \frac{1}{p^2}\left(P^{(1)} + \lambda\,P^{(0)}\right)_\mu{}^\nu \tag{34}$$

Of course, gauge invariance implies that any physical result must be
independent of the gauge, i.e. independent of $\lambda$.

We saw before that as a consequence of the equations of motion for the
massless theory the current $j_\mu$ must be conserved which means

$$p^\mu j_\mu(p) = 0 \tag{35}$$

Substituting (34) into (20) and using (26) and (35) we can write for the case
$m = 0$

$$A_\mu(p) = \frac{1}{p^2}\,j_\mu(p) \tag{36}$$

Any classical experiment which will involve the interaction of two sources,
say $j^{(1)}_\mu$ and $j^{(2)}_\mu$, will depend on

$$j^{(1)\mu}A^{(2)}_\mu = \frac{j^{(1)\mu}\,j^{(2)}_\mu}{p^2} \tag{37}$$

and is indeed independent of the gauge.
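The gauge independence claimed here can be illustrated numerically. In the sketch below (ours, with arbitrary test values) two conserved currents are manufactured by projecting random vectors with $P^{(1)}$, and their interaction through the covariant-gauge propagator is evaluated for several values of the gauge parameter. The names `gauge_propagator` and `interaction` are hypothetical helpers introduced for the illustration.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
p_up = np.array([1.5, 0.2, -0.4, 0.3])     # arbitrary momentum with p^2 != 0
p_dn = eta @ p_up
p2 = p_up @ p_dn

P0 = np.outer(p_dn, p_up) / p2             # (P0)_mu^nu, row = lower index
P1 = np.eye(4) - P0


def gauge_propagator(lam):
    """Covariant-gauge propagator (P1 + lam * P0) / p^2."""
    return (P1 + lam * P0) / p2


rng = np.random.default_rng(0)
j1 = P1 @ rng.normal(size=4)               # lower-index current, p^mu j_mu = 0
j2 = P1 @ rng.normal(size=4)


def interaction(lam):
    """j1^mu G_mu^nu(lam) j2_nu."""
    return (eta @ j1) @ gauge_propagator(lam) @ j2


values = [interaction(lam) for lam in (0.0, 1.0, 7.5)]
expected = (eta @ j1) @ j2 / p2            # j1 . j2 / p^2

print(values, expected)
```

The gauge-dependent piece multiplies $p^\mu j_\mu$ on both sides, so it drops out only because both currents are conserved; with a non-conserved test current the three values would differ.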

We return now to the massive case. If in addition to the equations of motion
(16) we postulate that the source is conserved, i.e. we assume equation (35)
as a field equation, we can write for the case $m \neq 0$

$$A_\mu(p) = \frac{1}{p^2 - m^2}\,j_\mu(p) \tag{38}$$

Assuming $m$ is very small (specifically the Fourier components are such that
$m^2 \ll p^2$) we find

$$A_\mu(p) \simeq \frac{1}{p^2}\,j_\mu(p) \tag{39}$$

This completes the proof that, as far as physical observations are concerned,
the theory of a classical spin one field for small mass approaches the results
for a massless spin one field.

Let us now turn to the spin two case. Since $h_{\mu\nu}$ is a symmetric tensor
with ten components, we proceed by writing $K_{\mu\nu}{}^{\alpha\beta}$ in
terms of its spin two, spin one and two spin zero projection operators. In
order that we may find the inverse of $K$ easily we must find those two spin
zero projection operators which are orthogonal. This is carried out in
Appendix A and we can write

$$K_{\mu\nu}{}^{\alpha\beta} = \left[(p^2 - m^2)\,Q^{(2)} - m^2\,Q^{(1)} + X(p)\,Q^{(0)}_X + Y(p)\,Q^{(0)}_Y\right]_{\mu\nu}{}^{\alpha\beta} \tag{40}$$

where $Q^{(2)}$, $Q^{(1)}$, $X$, $Y$, $Q^{(0)}_X$ and $Q^{(0)}_Y$ are defined in
the Appendix. Neither $X(p)$ nor $Y(p)$ vanishes so that we can invert $K$ to
find

$$G_{\mu\nu}{}^{\alpha\beta} = \left[\frac{1}{p^2 - m^2}\,Q^{(2)} - \frac{1}{m^2}\,Q^{(1)} + \frac{1}{X(p)}\,Q^{(0)}_X + \frac{1}{Y(p)}\,Q^{(0)}_Y\right]_{\mu\nu}{}^{\alpha\beta} \tag{41}$$

Using the results of the Appendix we can also write 

G ^ = 2Tp^m r )\\ 8 ^~3 r '^ J \ 2m 2 ) 

3m 3 m J 


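Since neither $X(p)$ nor $Y(p)$ vanishes we see that only the spin two
components have a pole in $p^2$ and thus only the five spin two fields are
dynamical variables.

Again we see that we cannot take the limit $m \to 0$ in (41). As before, for
$m = 0$, $K$ is a combination of projection operators which do not span the
space of symmetric second rank tensors. Therefore $K$ does not have an
inverse. In fact we find for $m = 0$

$$K_{\mu\nu}{}^{\alpha\beta} = \left(p^2\,Q^{(2)} - 2p^2\,Q^{(0)}\right)_{\mu\nu}{}^{\alpha\beta} \tag{43}$$

where

$$Q^{(0)}{}_{\mu\nu}{}^{\alpha\beta} = \tfrac{1}{3}\,(\eta_{\mu\nu} - \hat p_\mu\hat p_\nu)(\eta^{\alpha\beta} - \hat p^\alpha\hat p^\beta) \tag{44}$$

This is just a manifestation of the gauge invariance of the theory. We can
solve for the field $h_{\mu\nu}$ by 'going into' a gauge. This means we modify
the equations of motion so that the modified $K$ will have an inverse. The
simplest class of gauges are the covariant ones and we write

$$K_{\mu\nu}{}^{\alpha\beta}(\lambda_0, \lambda_1) = \left(p^2\,Q^{(2)} - 2p^2\,Q^{(0)} + \frac{1}{\lambda_1}\,Q^{(1)} + \frac{1}{\lambda_0}\,Q^0\right)_{\mu\nu}{}^{\alpha\beta} \tag{45}$$

where $Q^0$ must be orthogonal to $Q^{(0)}$ and is

$$Q^0{}_{\mu\nu}{}^{\alpha\beta} = \hat p_\mu\hat p_\nu\hat p^\alpha\hat p^\beta \tag{46}$$

The inverse of $K$ is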

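$$G_{\mu\nu}{}^{\alpha\beta}(\lambda_0, \lambda_1) = \left(\frac{1}{p^2}\,Q^{(2)} - \frac{1}{2p^2}\,Q^{(0)} + \lambda_1\,Q^{(1)} + \lambda_0\,Q^0\right)_{\mu\nu}{}^{\alpha\beta} \tag{47}$$

Substituting for $Q^{(2)}$ and $Q^{(0)}$ from the Appendix we have

$$G_{\mu\nu}{}^{\alpha\beta}(\lambda_0, \lambda_1) = \frac{1}{p^2}\left\{\tfrac{1}{2}\left[(\delta_\mu^\alpha - \hat p_\mu\hat p^\alpha)(\delta_\nu^\beta - \hat p_\nu\hat p^\beta) + (\alpha \leftrightarrow \beta)\right] - \tfrac{1}{2}\,(\eta_{\mu\nu} - \hat p_\mu\hat p_\nu)(\eta^{\alpha\beta} - \hat p^\alpha\hat p^\beta)\right\} + \left(\lambda_1\,Q^{(1)} + \lambda_0\,Q^0\right)_{\mu\nu}{}^{\alpha\beta} \tag{48}$$

The result of any observation must be independent of the $\lambda_i$. If we
have two sources $T^{(1)}_{\mu\nu}$ and $T^{(2)}_{\mu\nu}$, their interaction
is proportional to

$$T^{(1)\mu\nu}\,h^{(2)}_{\mu\nu} = -T^{(1)\mu\nu}\,G_{\mu\nu}{}^{\alpha\beta}\,T^{(2)}_{\alpha\beta} \tag{49}$$

We saw that for a massless theory the source must be conserved as a
consequence of the field equations, i.e. we must have

$$p^\alpha T_{\alpha\beta} = 0 \tag{50}$$

Now from the properties of $Q^{(1)}$ and $Q^0$ we have

$$Q^{(1)}{}_{\mu\nu}{}^{\alpha\beta}\,T_{\alpha\beta} = Q^0{}_{\mu\nu}{}^{\alpha\beta}\,T_{\alpha\beta} = 0 \tag{51}$$

so that we may finally write for the interaction of two sources

$$T^{(1)\mu\nu}\,G_{\mu\nu}{}^{\alpha\beta}\,T^{(2)}_{\alpha\beta} = \frac{1}{p^2}\left(T^{(1)\alpha\beta}\,T^{(2)}_{\alpha\beta} - \tfrac{1}{2}\,T^{(1)\mu}{}_\mu\,T^{(2)\nu}{}_\nu\right) \tag{52}$$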



Let us now return to the massive case. We find in the Appendix that for small m 

y=-2p 2 


v~ 3m 

X + Y ~3m 4U 

So that we have for $m$ small ($\neq 0$)

\p m 3m I 





If we choose sources such that equation (50) is an equation of motion, then 
using equation (51) we have for the interactions between two sources when the 
spin two field has a small but non-zero mass 



which using (51) again gives

$$T^{(1)\mu\nu}\,G_{\mu\nu}{}^{\alpha\beta}\,T^{(2)}_{\alpha\beta} = \frac{1}{p^2}\left(T^{(1)\alpha\beta}\,T^{(2)}_{\alpha\beta} - \tfrac{1}{3}\,T^{(1)\mu}{}_\mu\,T^{(2)\nu}{}_\nu\right)$$

This result can only be the same as (52) if the sources are traceless. This is
not normally the case. The energy momentum tensor for electromagnetic
radiation is traceless while that for matter is not. This would give rise to a
discrepancy in the bending of light experiment if gravitation were a spin two,
small mass theory. [See van Dam and Veltman (1970) and Boulware and Deser
(1972).]
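The size of the discrepancy can be made explicit with a small sketch. It is ours, not the author's, and uses two illustrative diagonal sources chosen only for convenience: a static matter source $\mathrm{diag}(1, 0, 0, 0)$ and a traceless, radiation-like source $\mathrm{diag}(3, 1, 1, 1)$. The two theories differ only in the coefficient of the trace term ($\tfrac12$ in the massless case, $\tfrac13$ in the small-mass limit), so the sketch compares the radiation-matter coupling after fixing the overall strength from the matter-matter (Newtonian) interaction.

```python
from fractions import Fraction

ETA = (1, -1, -1, -1)                      # diagonal of the metric


def contract(t1, t2):
    """T1^{ab} T2_{ab} for diagonal tensors given by their lower-index diagonals."""
    # raising both indices of a diagonal entry multiplies it by eta_aa**2 = 1
    return sum(Fraction(a) * Fraction(b) for a, b in zip(t1, t2))


def trace(t):
    """T^a_a = eta^{aa} T_{aa} for a diagonal tensor."""
    return sum(Fraction(e) * Fraction(x) for e, x in zip(ETA, t))


def exchange(t1, t2, c):
    """Bracket T1.T2 - c * tr(T1) * tr(T2) that multiplies 1/p^2."""
    return contract(t1, t2) - c * trace(t1) * trace(t2)


matter = (1, 0, 0, 0)                      # static source, trace 1
radiation = (3, 1, 1, 1)                   # traceless: 3 - 1 - 1 - 1 = 0

massless, massive = Fraction(1, 2), Fraction(1, 3)

# radiation-matter coupling in units of the matter-matter coupling
relative_massless = exchange(matter, radiation, massless) / exchange(matter, matter, massless)
relative_massive = exchange(matter, radiation, massive) / exchange(matter, matter, massive)

ratio = relative_massive / relative_massless
print(ratio)
```

With the Newtonian interaction held fixed, the radiation coupling in the small-mass theory comes out at three quarters of its massless value, independent of how small the mass is.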

By comparing equations (56) and (47) we see how we could modify the spin
two massive theory to give the same results as the spin two massless theory.
We need only add in a spin zero particle which in the limit of small $m$ will
contribute the term

$$-\frac{1}{2p^2}\,Q^{(0)}$$

to the propagator. This is most easily accomplished by choosing the $a$ in
equation (6a) not equal to one. We saw for $a \neq 1$, equation (6a) is an
equation for six dynamical variables, one of which will be the extra spin zero
particle. However, we see from the relative sign between the $Q^{(2)}$ and
$Q^{(0)}$ terms that the extra spin zero particle must be a ghost.


In this section we construct those Lagrangians which lead to the equations of
motion (5) and (6). We do this in order to find the variables canonical to the
independent dynamical variables. Having done so, we are able to pass to the
$m \to 0$ limit in order to see what happens to the apparently vanishing
degrees of freedom.

We shall discuss the spin one case first. Given the equations of motion (5)
one can easily find a Lagrangian from which they are derived. We may write*

$$\mathcal{L}(x) = \tfrac{1}{2}\,A_\alpha\,\Lambda^{\alpha\beta}A_\beta + \tfrac{1}{2}\,m^2A_\alpha\,\delta^\alpha_\beta\,A^\beta + j^\alpha A_\alpha \tag{59}$$





fiap\ c 

^ vfipk ~ 




*This is the usual Lagrangian with kinetic energy term
$-\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}$ where
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$.


This Lagrangian is not unique. By using the antisymmetry property of the 
Levi-Civita tensor we can add total derivatives to the Lagrangian by adding 
any multiple of* 


I A a A afi A l 






In fact there is no need that this extra piece be Lorentz invariant since it does 
not contribute to the equations of motion. More generally one can add to the 
Lagrangian $\mathcal{L}(x)$ a term $\mathcal{L}_\kappa(x)$, where




where $\kappa_a$, $\kappa'_b$ are any set of parameters.

Since we have already discovered that the field $A_0$ is not a dynamical
variable, we will choose that Lagrangian such that the variable canonical to
$A_0$, namely $\Pi_0$, is identically zero. Now


8£ T 

We have 

" Sd°A a 

£ T = £+£ lt 


If we choose 
we find that 

Sff-A" 21 ' tapA u v™B 

$$= -(\partial_\mu A_\alpha - \partial_\alpha A_\mu)$$


$$\Pi_0 = 0$$

$$\Pi_i = -(\dot A_i - \partial_i A_0)$$






We have of course been using the summation convention for summing over repeated indices. 
However, since in what follows we shall be writing down non-covariant additions to the Lagrangian 
we now specifically indicate summations where needed. 


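Since $A_0$ is a dependent variable we use the equations of motion for $A_0$
to find $\Pi_i$ entirely in terms of $A_i$ and $\dot A_i$.
The equation of motion (5) for $\alpha = 0$ gives

$$A_0 = \frac{\partial_i\dot A^i}{m^2 - \nabla^2} - \frac{j_0}{m^2 - \nabla^2} \tag{72}$$

and finally (repeated spatial indices on $\Delta_{ij}$ being summed)

$$\Pi_i = -\Delta_{ij}\,\dot A_j - \frac{\partial_i\,j_0}{m^2 - \nabla^2}$$

where

$$\Delta_{ij} = \delta_{ij} + \frac{\partial_i\partial_j}{m^2 - \nabla^2} \tag{75}$$

The inverse of $\Delta$ is

$$\Delta^{-1}_{ij} = \delta_{ij} - \frac{\partial_i\partial_j}{m^2} \tag{76}$$

For $m = 0$, we see that $\Delta$ does not have an inverse and is in fact a
projection operator. It is precisely the helicity one projection operator.
This indicates that for $m = 0$ there are only two canonical momenta which are
the momenta canonical to the two helicity one dynamical variables.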
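The statements about $\Delta$ are easy to verify in momentum space. The sketch below is ours and rests on the usual substitution $\partial_i \to i k_i$, $\nabla^2 \to -|\mathbf{k}|^2$, under which $\Delta_{ij}$ becomes $\delta_{ij} - k_i k_j/(m^2 + |\mathbf{k}|^2)$ and the proposed inverse becomes $\delta_{ij} + k_i k_j/m^2$. The function names and the test vector are illustrative.

```python
import numpy as np


def delta_operator(k, m):
    """Momentum-space Delta_ij = delta_ij - k_i k_j / (m^2 + |k|^2)."""
    k = np.asarray(k, dtype=float)
    return np.eye(3) - np.outer(k, k) / (m**2 + k @ k)


def delta_inverse(k, m):
    """Momentum-space Delta^{-1}_ij = delta_ij + k_i k_j / m^2 (needs m != 0)."""
    k = np.asarray(k, dtype=float)
    return np.eye(3) + np.outer(k, k) / m**2


k = np.array([0.4, -1.2, 0.7])

# massive case: the two operators are inverse to each other
product = delta_operator(k, 0.3) @ delta_inverse(k, 0.3)

# massless case: Delta is the transverse (helicity one) projector
D0 = delta_operator(k, 0.0)

print(np.allclose(product, np.eye(3)),
      np.allclose(D0 @ D0, D0),
      np.linalg.matrix_rank(D0))
```

At $m = 0$ the operator is idempotent with rank two and annihilates the direction of $\mathbf{k}$, which is the statement that only the two transverse momenta survive.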

We have been assuming that $j_\mu$ does not depend on the $A_\mu$, in which
case $j_0$ will commute with the $A_i$. Accordingly, as far as the canonical
commutation relations are concerned we can replace the $\Pi_i$ by

$$\Pi_i^c = -\Delta_{ij}\,\dot A_j \tag{77}$$

We will drop the superscript $c$ whenever there is no confusion. The canonical
commutation relations are

$$[A^i(x), \Pi_j(y)] = i\,\delta^i_j\,\delta^3(x - y) \tag{78}$$

or using (76)

$$[A^i(x), \dot A_j(y)] = -i\left(\delta_{ij} - \frac{\partial_i\partial_j}{m^2}\right)\delta^3(x - y) \tag{79}$$

Of course we can write (79) only in the case when $m \neq 0$.

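We can see what happens in the limit $m \to 0$ by writing (78) separately for
the helicity one and helicity zero subspaces.

In Appendix B we show that we can write

$$A_i = A_i^{(1)} + A_i^{(0)} \tag{80}$$

where

$$A_i^{(1)} = R^{(1)}_{ij}A_j, \qquad A_i^{(0)} = R^{(0)}_{ij}A_j$$

$$R^{(0)}_{ij} = \frac{\partial_i\partial_j}{\nabla^2}, \qquad R^{(1)}_{ij} = \delta_{ij} - \frac{\partial_i\partial_j}{\nabla^2} \tag{82}$$

where $R^{(1)}$ and $R^{(0)}$ are the helicity one and zero projection
operators. Now, from (77) and for $m$ small

$$\Pi_i \simeq -\dot A_i^{(1)} + \frac{m^2}{\nabla^2}\,\dot A_i^{(0)}$$

The commutation relations (78) can be written

$$[A^{(1)i}(x), \dot A^{(1)}_j(y)] = -i\,R^{(1)}_{ij}\,\delta^3(x - y) \tag{85}$$

$$\left[A^{(0)i}(x), \frac{m^2}{\nabla^2}\,\dot A^{(0)}_j(y)\right] = i\,R^{(0)}_{ij}\,\delta^3(x - y) \tag{86}$$

Let us define

$$\partial_i\phi = -m\,A_i^{(0)} \tag{87}$$

Equation (86) becomes

$$[\phi(x), \dot\phi(y)] = i\,\delta^3(x - y) \tag{89}$$

The commutation relations (85) are precisely those satisfied in the $m = 0$
case by the independent helicity one fields. The canonical variables for the
helicity zero are $\phi(x)$ and $\dot\phi(x)$ given by (87). This is verified
by (89).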

We now look at the equations of motion satisfied by $A_i^{(1)}$ and $\phi$.
From equation (5) for $\alpha = i$ we have

$$\Box A_i - \partial_i\partial_j A^j - \partial_i\dot A^0 + m^2A_i = -j_i \tag{90}$$


= 3^ 


We substitute for $A_0$ from (72) to obtain


We saw that for the case $m = 0$ current must be conserved and

$$\Delta_{ij} \to R^{(1)}_{ij} \tag{93}$$

the helicity one projection operator, so that for $m = 0$ we have

$$\Box A_i^{(1)} = -j_i^{(1)} \tag{94}$$

and from (72) 

A ° v 2 V 2 

For $m \neq 0$ but small we have

$$\Delta_{ij} \simeq R^{(1)}_{ij} - \frac{m^2}{\nabla^2}\,R^{(0)}_{ij} \tag{96}$$

We substitute into (92) and project out the helicity one and zero parts and
use (18) to obtain

$$\Box A_i^{(1)} = -j_i^{(1)} \tag{97}$$


m V 


If in the massive case we assume current conservation as a further equation
of motion we find as $m \to 0$

$$\Box\phi = 0 \tag{99}$$

Accordingly, as $m \to 0$ the helicity one modes satisfy exactly the same
equations of motion and commutation relations as the helicity one modes in the
massless case. The helicity zero mode is, however, decoupled. This is the
sense in which the third degree of freedom disappears.

We now turn to the case of spin two. From the equations of motion (6) we can 
deduce a Lagrangian, namely, 

2(x) = fr%A^-\m 2 he(8Z8e-8&X+TZh% (100) 


A£=S"^e ^ 



Of course as in the spin one case this Lagrangian is not unique and we can add 
terms of the form 

where A may be 


2 ,L "a^A hptjK a Kp K p Kfr 


A" *" = d»e^ x e°"' yX d''ri f> " 

d*~ix atByX. t.v per 




As in the spin one case we choose that Lagrangian such that the variables
$\Pi^0{}_\beta$ canonical to the redundant fields $h^0{}_\beta$ are identically
zero. Consider

where $\mathcal{L}(x)$ is given by (100). This gives



82 _ 


c ^0; a ~ 2E0ap\E Ojla- 

OO Ho 

— 2~(8Z8 s r — 8 s a 8")dnhs8a 

OtJj j . - r i r\ 

S M, a = 2(3 a « r - ^rrt J 

od ho 





Accordingly, we must add to $\mathcal{L}(x)$ some terms of the form (102)
which will ensure that



We may do this if we add to $\mathcal{L}(x)$


= (.srh a ^h a v -d v h aix d^K) 

+ (d a h a0 d h r r -d h a0 d a h^ 

<e T (x)=<£{x) + 2 K {x) 



we find, 

110 Sd°h° a ° 
The other six canonical momenta will be given by 

U ' 8d°h) 

From (105) we find 

^=a M-W*£+Wg-5(d,V W/t,o) 
Sd hj 

From (112) we obtain 


Sd hj 

Tin = hi + VijWrh r °-tid- (dfro + djHio) 






We eliminate the dependent variables $h_{i0}$ using the equations of motion.
In equation (6) we set the indices $\alpha = i$ and $\beta = 0$ and using (8)
we obtain

$$(-\nabla^2 + m^2)\,\Delta_i{}^r\,h_{r0} = T_{i0} + \partial_r\dot h^r{}_i - \partial_i\dot h^r{}_r \tag{118}$$

where $\Delta_{ir}$ is given by (75).
Equation (6) with $\alpha = \beta = 0$ gives the constraint equation on the
six $h_{ij}$

\uh" = 



Now multiply (118) by $\Delta^{-1}$ and use (119) to obtain

*« = -raf 9ti+ T 0i +^3,0^°)] (120) 

AM —V L /« -I 

We substitute $h_{0i}$ into (117) to obtain

n iy = A 1 >A /s /i"+A^;+^^[(a,To / +a / To I )+^a 1 a/(^^ )] (121) 

In writing our commutation relations we will assume that $T_{\mu\nu}$ is
independent of $h_{ij}$ and $\Pi_{ij}$. Since the $h_{ij}$ are restricted by
the constraints (119), we must find the $\Pi_{ij}$ which are restricted to the
same subspace. We make use of the helicity projection operators defined in
Appendix B. A general six component tensor $h_{ij}$ can be decomposed into its
helicity components: two helicity two, two helicity one, one helicity zero and
a second helicity zero.

hii = (R m +R m +R i0) +R l0) ^h rs (122) 


If we choose i? <0> to be the helicity zero projection operator, 

A,-,A ra 

(0)rs _ **■>}' 

DW> rs = 

A m „A" 


we may write using (119) 

h ii = (R w +R w +R m ) r iJh rs + 

■ oo 




m „A V -m 
Let us define the projection operator P" such that 

We may write the canonical commutation relations as 

[Mx),n re (y)] = /P£S 3 (x-y) 


U..=p"} n h 1 
rt ij ± i] Tl mn I 

n„=/>3i« J 

Equivalently, we may write the commutation relations in the various helicity 
subspaces as 

[^f ) ,m? ) ]=/i?^5 3 (x-y) (128) 


a =0,1,2 


h^ = R^h rs , etc. (129) 

Using the results of Appendix B, we can find the commutation relations for 
small m. 
Using (B.ll) we may write (in coordinate space) 



hP = ^0ti n +art n ) 

h? i —a{r h .+^)h' 

a'Ai" = o 




so that the h \ are the helicity one components of a spin one field. Also 

m; } =-h\ 




Substituting (B.31) into (128) we have 

Now take (d/dx,)(d/dy s ) of (134) and we obtain 



-Jim (1) V2m 
Ai=-=T-hi = — =7-3, 


(Mx), A r (y)) = -i(r,ir + d -^f) 8 3 (*-y) 




where $R^{(1)}$ is defined by (82). Using (B.34) and (B.25) we have for small $m$,


Similarly from (B.33) 

h fjj^> h ^ 

h (0) = d -£h rs 

n " ~2\V 2 ) V 2 " 
Substituting into (128) and projecting out with S <0) we have 

[cw,|(^) 2 ^(y)]-^« 3 ('-y) 

We now operate on (142) with 





a j__d d_ 

dXj dXj dy r dy s 

to obtain 

[vV, |(^)V^°] = /W(x-y) (143) 



and (143) becomes 


[<f>(x),4>(y)] = iS 3 (x-y) 


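Of course the commutation relation (128) for $a = 2$ is the same for both
$m = 0$ and $m \neq 0$, i.e.

$$[h^{(2)}_{ij}(x), \Pi^{(2)}_{rs}(y)] = i\,R^{(2)}_{ij,rs}\,\delta^3(x - y)$$

As $m \to 0$ the dynamical variables associated with helicity one and zero are
the fields $A_i$ and $\phi$ defined by (136) and (144), respectively. Next we
find their equations of motion. Operate on equation (6) with
$\partial_\alpha\partial^\beta$ which gives

$$\left(\partial_\alpha\partial^\beta h^\alpha{}_\beta - \Box h\right) = \frac{1}{m^2}\,\partial_\alpha\partial^\beta T^\alpha{}_\beta \tag{147}$$

Next take the trace of (6) and use (147) to give

$$h = -\frac{1}{3m^2}\left(T^\mu{}_\mu - \frac{2}{m^2}\,\partial_\alpha\partial^\beta T^\alpha{}_\beta\right) \tag{148}$$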
We use (147) and (148) in equation (6) when $\alpha = i$, $\beta = j$ to obtain

Oka - dtdji y- djdji "i+m 2 ^, 

~ Tii+ Jm i \ T ^~m rT ) A K+ m 21 ) 
We substitute for $h_{0i}$ using (120) giving

(a + m 2 )(A ir A is - ( ^2 )2 ) h rs 

_( did,d r d s \ djd/ f _ d„d^ v \ 

-\ AirAis ~(m 2 -V 2 ) 2 ) T + 3m 2 V» m 2 ' J 

Will rpll. , "lt"V rp^lA 

3V" m 2 ' / 


Using the formula for $\Delta_{ir}\Delta_{js}$ given by (B.24) the left side of
equation (150) can be written for small $m$ as





R w- S ^+S 

/ iirs 



We may now take projections of (150) to obtain the equations for the various
helicity fields. The $S^{(0)}$ projection will just give the constraint
equation (119) in the $m \to 0$ limit. Since

Rf rS d r d s =Rf rS Vrs=Rf rS dr=0 

we have 




T< 2 >— n(2)rs T 
1 ij — -TV ij 1 rs 





R\j )r %d s =R ( i )rs Vrs = 
and equations (133) and (136) we have for small m 

si 2 V 2 L rn x 

m li 

If we choose sources which are conserved, i.e.

$$\partial^\mu T_{\mu\nu} = 0 \tag{157}$$

then as $m \to 0$, equation (156) becomes

$$\Box A_i = 0 \tag{158}$$

This means the dynamical variables corresponding to the helicity one fields
become decoupled in the $m \to 0$ limit if the source is conserved.

We now project out the helicity zero field from equation (150), by
multiplying the equation by $S^{(0)}$, taking the trace and using the
definition (144), to obtain for small $m$

1 (159) 

Again, assuming (157) and letting $m \to 0$ we find

Q* = --^ (160) 

Here, we see that in the limit, the helicity zero field does not decouple if
the trace of the source, $T^\mu{}_\mu$, does not vanish.

Although the equation (153) together with the constraint equations (120) and
(119) (with source conserved and $m = 0$) make up the content of the massless
theory, the fact that the helicity zero field $\phi$ does not decouple shows
that the $m \to 0$ theory is not the same as the massless theory.


The main conclusion of this work concerns the difference in behaviour between
massive spin one and spin two theories as the masses approach zero. The limit
as the mass approaches zero of a theory of a spin one field coupled to a
conserved source gives the same observational results (in the classical limit
or tree approximation limit) as the theory of massless spin one. However the
same limit of a spin two theory coupled to a conserved source can give the
same results only if the source is traceless in the limit. This is usually not
the case if we expect the spin two source to be the energy-momentum tensor.
These results have been obtained before. By making use of the properties of
the Levi-Civita tensor and constructing projection operators in spin and
helicity space we have followed through in detail the properties of the
various dynamical variables as the mass $m$ becomes small. From the equations
of motion we have seen how the number of dynamical variables changes from
$2s + 1$ to 2 (where $s = 1$ or 2) depending on whether $m \neq 0$ or $m = 0$.
This is due to the gauge invariance which the $m = 0$ theories possess. The
presence or absence of gauge invariance appears again in our construction of
the propagators. For $m \neq 0$ the equations of motion can be inverted to
give a unique propagator. For $m = 0$, due to the gauge invariance, the
equations of motion are proportional to spin projection operators and can only
be inverted by choosing some gauge. We see that for spin one, a conserved
source allows the $m \to 0$ limit to be taken and the $m = 0$ results are
reproduced for physical observables. For the spin two case a conserved source
alone does not reproduce the same results as the $m = 0$ theory. One would
have to assume the source to be traceless in addition. An examination of the
dynamical variables and their canonical conjugates allows us to see what
happens to the supposed disappearing degrees of freedom as $m \to 0$. For spin
one we find that the extra helicity zero degree of freedom disappears only in
the sense that it is decoupled if the source is conserved. For spin two the
extra helicity one degrees of freedom are decoupled if the source is conserved
but the helicity zero component does not disappear but is coupled to the trace
of the source. This again shows why the $m \to 0$ limit for spin two does not
reproduce the same observable results as the $m = 0$ case.


The author would like to thank Dr. T. Fulton for many discussions which 
aroused his interest in the problem. He would also like to thank Dr. Abdus
Salam for his hospitality at the Institute for Theoretical Physics, Trieste, where
much of this work was done. This work was supported partially by the N.S.F. 


In this Appendix we construct the spin projection operators in the space of 
second rank symmetric tensors in Lorentz space. 

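A symmetric tensor in space time has ten components and its spin decomposition
is into spin two, spin one and two spin zero. Its spin two components $h^{(2)}_{\mu\nu}$,
of which there are five, must satisfy

$$p^\mu h^{(2)}_{\mu\nu} = h^{(2)\mu}{}_{\mu} = 0 \qquad (A.1)$$

Thus the projection operator $Q^{(2)\alpha\beta}_{\mu\nu}$ must also have the properties

$$p^\mu Q^{(2)\alpha\beta}_{\mu\nu} = \eta^{\mu\nu} Q^{(2)\alpha\beta}_{\mu\nu} = 0 \qquad (A.2)$$

and it must be symmetric in $(\mu, \nu)$ and $(\alpha, \beta)$. One easily finds

$$Q^{(2)\alpha\beta}_{\mu\nu} = \tfrac{1}{2}\left[(\delta^\alpha_\mu - \hat p_\mu \hat p^\alpha)(\delta^\beta_\nu - \hat p_\nu \hat p^\beta) + (\mu \leftrightarrow \nu)\right] - \tfrac{1}{3}(\eta_{\mu\nu} - \hat p_\mu \hat p_\nu)(\eta^{\alpha\beta} - \hat p^\alpha \hat p^\beta) \qquad (A.3)$$

$$(Q^{(2)})^2 = Q^{(2)} \qquad (A.4)$$

that is,

$$Q^{(2)\rho\sigma}_{\mu\nu} Q^{(2)\alpha\beta}_{\rho\sigma} = Q^{(2)\alpha\beta}_{\mu\nu}$$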

The spin one component h™ must have the properties that 


p»h { ? = Q 

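The projection operator $Q^{(1)\alpha\beta}_{\mu\nu}$, which must be orthogonal to $Q^{(2)\alpha\beta}_{\mu\nu}$, is found to be

$$Q^{(1)\alpha\beta}_{\mu\nu} = \tfrac{1}{2}\left[\hat p_\mu \hat p^\alpha(\delta^\beta_\nu - \hat p_\nu \hat p^\beta) + \hat p_\mu \hat p^\beta(\delta^\alpha_\nu - \hat p_\nu \hat p^\alpha) + (\mu \leftrightarrow \nu)\right]$$

$$= \tfrac{1}{2}\left[\hat p_\mu \hat p^\alpha \delta^\beta_\nu + \hat p_\mu \hat p^\beta \delta^\alpha_\nu + (\mu \leftrightarrow \nu)\right] - 2\hat p_\mu \hat p_\nu \hat p^\alpha \hat p^\beta \qquad (A.8)$$

and again

$$(Q^{(1)})^2 = Q^{(1)} \qquad (A.9)$$

$$Q^{(2)} Q^{(1)} = 0 \qquad (A.10)$$

Since the symmetric tensor $h_{\mu\nu}$ is a reducible representation of the Lorentz
group, there is no unique decomposition of the two spin zero parts. In fact there
is generally a one parameter infinity of two orthogonal projection operators.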
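
The projector properties of the spin two and spin one operators can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the metric diag(+1, -1, -1, -1), a unit timelike vector $\hat p$ with $\hat p^2 = 1$ (the sample value of `p_up` is hypothetical), and the standard forms of $Q^{(2)}$ and $Q^{(1)}$ built from $\theta_{\mu\nu} = \eta_{\mu\nu} - \hat p_\mu \hat p_\nu$. The helper names `make`, `mul` and `trace` are illustrative only.

```python
from fractions import Fraction
from itertools import product

R = range(4)
# Metric diag(+1, -1, -1, -1); its upper- and lower-index components coincide numerically.
g = [[Fraction((m == n) * (1 if m == 0 else -1)) for n in R] for m in R]

# Hypothetical unit timelike vector (upper indices): p^2 = 25/9 - 16/9 = 1.
p_up = [Fraction(5, 3), Fraction(4, 3), Fraction(0), Fraction(0)]
p_dn = [sum(g[m][n] * p_up[n] for n in R) for m in R]

def th_mixed(m, a):
    """theta_mu^alpha = delta_mu^alpha - p_mu p^alpha."""
    return Fraction(int(m == a)) - p_dn[m] * p_up[a]

def th_dn(m, n):
    """theta_{mu nu} = g_{mu nu} - p_mu p_nu."""
    return g[m][n] - p_dn[m] * p_dn[n]

def th_up(a, b):
    """theta^{alpha beta} = g^{alpha beta} - p^alpha p^beta."""
    return g[a][b] - p_up[a] * p_up[b]

def make(fn):
    """Tabulate an operator O_{mu nu}^{alpha beta}, keyed by (mu, nu, alpha, beta)."""
    return {k: fn(*k) for k in product(R, repeat=4)}

def mul(X, Y):
    """Operator product X_{mu nu}^{rho sigma} Y_{rho sigma}^{alpha beta}."""
    return {(m, n, a, b): sum(X[m, n, r, s] * Y[r, s, a, b] for r in R for s in R)
            for m, n, a, b in product(R, repeat=4)}

def trace(X):
    """Trace O_{mu nu}^{mu nu}; for a projector this counts its components."""
    return sum(X[m, n, m, n] for m in R for n in R)

half, third = Fraction(1, 2), Fraction(1, 3)
Q2 = make(lambda m, n, a, b:
          half * (th_mixed(m, a) * th_mixed(n, b) + th_mixed(n, a) * th_mixed(m, b))
          - third * th_dn(m, n) * th_up(a, b))
Q1 = make(lambda m, n, a, b:
          half * (p_dn[m] * p_up[a] * th_mixed(n, b) + p_dn[m] * p_up[b] * th_mixed(n, a)
                  + p_dn[n] * p_up[a] * th_mixed(m, b) + p_dn[n] * p_up[b] * th_mixed(m, a)))

assert mul(Q2, Q2) == Q2 and mul(Q1, Q1) == Q1      # both are idempotent
assert all(v == 0 for v in mul(Q2, Q1).values())    # and mutually orthogonal
print(trace(Q2), trace(Q1))                          # prints: 5 3
```

The traces count five spin two and three spin one components, leaving two of the ten components of a symmetric tensor for the spin zero sector.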

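Let us define two projection operators,

$$A^{\alpha\beta}_{\mu\nu} = \tfrac{1}{4}\eta_{\mu\nu}\eta^{\alpha\beta} \qquad (A.11)$$

$$B^{\alpha\beta}_{\mu\nu} = \hat p_\mu \hat p_\nu \hat p^\alpha \hat p^\beta \qquad (A.12)$$

$$A^2 = A \qquad (A.13)$$

$$B^2 = B \qquad (A.14)$$

$$AQ^{(2)} = AQ^{(1)} = BQ^{(2)} = BQ^{(1)} = 0 \qquad (A.15)$$

Let us define

$$C^{\alpha\beta}_{\mu\nu} = \tfrac{1}{4}(\eta_{\mu\nu}\hat p^\alpha \hat p^\beta + \hat p_\mu \hat p_\nu \eta^{\alpha\beta})$$

and we have the following identities

$$ABA = A/4 \qquad (A.19)$$

$$BAB = B/4 \qquad (A.20)$$

$$AC + CA = A/2 + C \qquad (A.21)$$

$$BC + CB = B/2 + C \qquad (A.22)$$

$$C^2 = (A + B + C)/4 \qquad (A.23)$$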
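
The identities (A.19) to (A.23) can be confirmed with exact rational arithmetic. This is a sketch under stated assumptions: $A = \tfrac14\eta\eta$, $B = \hat p\hat p\hat p\hat p$ and $C = \tfrac14(\eta\,\hat p\hat p + \hat p\hat p\,\eta)$ with $\hat p^2 = 1$ and metric diag(+1, -1, -1, -1); the sample vector and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

R = range(4)
g = [[Fraction((m == n) * (1 if m == 0 else -1)) for n in R] for m in R]
p_up = [Fraction(5, 3), Fraction(4, 3), Fraction(0), Fraction(0)]  # hypothetical, p^2 = 1
p_dn = [sum(g[m][n] * p_up[n] for n in R) for m in R]

def make(fn):
    """Tabulate an operator O_{mu nu}^{alpha beta}, keyed by (mu, nu, alpha, beta)."""
    return {k: fn(*k) for k in product(R, repeat=4)}

def mul(X, Y):
    """Operator product X_{mu nu}^{rho sigma} Y_{rho sigma}^{alpha beta}."""
    return {(m, n, a, b): sum(X[m, n, r, s] * Y[r, s, a, b] for r in R for s in R)
            for m, n, a, b in product(R, repeat=4)}

def comb(*terms):
    """Linear combination: comb((c1, X1), (c2, X2), ...) = c1*X1 + c2*X2 + ..."""
    return {k: sum(c * X[k] for c, X in terms) for k in product(R, repeat=4)}

q, h = Fraction(1, 4), Fraction(1, 2)
A = make(lambda m, n, a, b: q * g[m][n] * g[a][b])
B = make(lambda m, n, a, b: p_dn[m] * p_dn[n] * p_up[a] * p_up[b])
C = make(lambda m, n, a, b: q * (g[m][n] * p_up[a] * p_up[b] + p_dn[m] * p_dn[n] * g[a][b]))

checks = {
    "A^2 = A": mul(A, A) == A,
    "B^2 = B": mul(B, B) == B,
    "ABA = A/4 (A.19)": mul(mul(A, B), A) == comb((q, A)),
    "BAB = B/4 (A.20)": mul(mul(B, A), B) == comb((q, B)),
    "AC + CA = A/2 + C (A.21)": comb((1, mul(A, C)), (1, mul(C, A))) == comb((h, A), (1, C)),
    "BC + CB = B/2 + C (A.22)": comb((1, mul(B, C)), (1, mul(C, B))) == comb((h, B), (1, C)),
    "C^2 = (A + B + C)/4 (A.23)": mul(C, C) == comb((q, A), (q, B), (q, C)),
    "AB + BA = C": comb((1, mul(A, B)), (1, mul(B, A))) == C,
}
for name, ok in checks.items():
    print(name, ok)
```

The last check, $AB + BA = C$, is not among the numbered identities visible here; it follows from the assumed definitions and is needed when squaring a general combination of A, B and C.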

We find that the most general spin zero projection operator can be written

$$Q^{(0)} = aA + bB - cC \qquad (A.24)$$

where a and b are the two roots of

$$4x^2 - 2(c + 2)x + c^2 = 0 \qquad (A.25)$$

for any c. The roots will be real provided

$$2 \geq c \geq -\tfrac{2}{3} \qquad (A.26)$$
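
The conditions (A.25) and (A.26) can be illustrated with a short calculation. The sketch below assumes the multiplication rules $A^2 = A$, $B^2 = B$, $AB + BA = C$ together with (A.21) to (A.23), expands $(aA + bB - cC)^2$ by hand into coefficients of A, B and C, and checks that the result is again $aA + bB - cC$ when a and b are the two roots. The function names are hypothetical.

```python
import math

def spin_zero_roots(c):
    """Return the two roots (a, b) of 4x^2 - 2(c + 2)x + c^2 = 0."""
    disc = 4.0 * (c + 2.0) ** 2 - 16.0 * c ** 2
    if disc < 0.0:
        raise ValueError("complex roots: c must lie between -2/3 and 2")
    root = math.sqrt(disc)
    return (2.0 * (c + 2.0) + root) / 8.0, (2.0 * (c + 2.0) - root) / 8.0

def square_coefficients(a, b, c):
    """Coefficients of A, B and C in (aA + bB - cC)^2 under the assumed rules."""
    coef_a = a * a - a * c / 2.0 + c * c / 4.0
    coef_b = b * b - b * c / 2.0 + c * c / 4.0
    coef_c = a * b - a * c - b * c + c * c / 4.0
    return coef_a, coef_b, coef_c

def is_projector(c, tol=1e-12):
    """True if aA + bB - cC squares to itself for the roots belonging to c."""
    a, b = spin_zero_roots(c)
    sa, sb, sc = square_coefficients(a, b, c)
    return abs(sa - a) < tol and abs(sb - b) < tol and abs(sc + c) < tol

for c in (0.0, 0.5, 4.0 / 3.0, 2.0):
    print(c, spin_zero_roots(c), is_projector(c))
```

For c = 0 the roots are 1 and 0, and for c = 4/3 they are 4/3 and 1/3.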

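For the case c = 0, either a or b = 0, i.e. A and B are projection operators.
Another special case is

$$c = \tfrac{4}{3} \qquad (A.27)$$

in which case

$$Q^{(0)} = \tfrac{4}{3}(A + \tfrac{1}{4}B - C) \qquad (A.28)$$

$$Q^{(0)\alpha\beta}_{\mu\nu} = \tfrac{1}{3}(\eta_{\mu\nu} - \hat p_\mu \hat p_\nu)(\eta^{\alpha\beta} - \hat p^\alpha \hat p^\beta) \qquad (A.29)$$

For a given $Q^{(0)}$ one can find a second projection operator $\tilde Q^{(0)}$ orthogonal to
it,

$$\tilde Q^{(0)} = \tilde a A + \tilde b B - \tilde c C \qquad (A.30)$$

where we find

$$a + \tilde a = b + \tilde b = c + \tilde c = \tfrac{4}{3} \qquad (A.31)$$

Using (A.11) and (A.12) one can demonstrate that

$$(Q^{(2)} + Q^{(1)} + Q^{(0)} + \tilde Q^{(0)})^{\alpha\beta}_{\mu\nu} = \tfrac{1}{2}(\delta^\alpha_\mu \delta^\beta_\nu + \delta^\beta_\mu \delta^\alpha_\nu) \qquad (A.32)$$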




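We now write the operator $K^{\alpha\beta}_{\mu\nu}$ defined by equation (19) in terms of these
projection operators. First, we have

$$K^{\alpha\beta}_{\mu\nu} = \left[(p^2 - m^2)Q^{(2)} - m^2 Q^{(1)} - \tfrac{8}{3}(p^2 - m^2)A - \tfrac{2}{3}(p^2 + 2m^2)B + \tfrac{4}{3}(2p^2 + m^2)C\right]^{\alpha\beta}_{\mu\nu} \qquad (A.33)$$

We must now find X(p), Y(p) and c(p) such that the A, B, C part of (A.33) can
be written

$$\left[XQ^{(0)} + Y\tilde Q^{(0)}\right]^{\alpha\beta}_{\mu\nu} \qquad (A.34)$$

To do this we must solve the equations

$$aX + \tilde a Y = -\tfrac{8}{3}(p^2 - m^2) \qquad (A.35)$$

$$bX + \tilde b Y = -\tfrac{2}{3}(p^2 + 2m^2) \qquad (A.36)$$

$$cX + \tilde c Y = -\tfrac{4}{3}(2p^2 + m^2) \qquad (A.37)$$

where we also use (A.31) and (A.25). We find that X and Y are the two
solutions for u of

$$u^2 + 2u(p^2 - m^2) - 3m^4 = 0 \qquad (A.38)$$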





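so that

$$XY = -3m^4 \qquad (A.39)$$

This implies neither X nor Y vanishes for any value of $p^2$.
We take Y as

$$Y = -(p^2 - m^2) + \left[(p^2 - m^2)^2 + 3m^4\right]^{1/2}$$

For small m,

$$Y \approx \frac{3m^4}{2p^2}$$

In addition

$$X = -(p^2 - m^2) - \left[(p^2 - m^2)^2 + 3m^4\right]^{1/2} \approx -2p^2 \quad \text{for small } m$$

$$a = -\frac{4}{3}\,\frac{(p^2 - m^2)}{X - Y} + \frac{2}{3}$$

$$b = +\frac{2}{3}\,\frac{(p^2 - 4m^2)}{X - Y} + \frac{2}{3}$$

$$c = -\frac{4}{3}\,\frac{(p^2 + 2m^2)}{X - Y} + \frac{2}{3}$$

Indeed as $m \to 0$, $Q^{(0)}$ is given by (A.29), and

$$XQ^{(0)} + Y\tilde Q^{(0)} \to -2p^2 Q^{(0)}$$

which produces equation (43).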
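
A quick numerical check of the small m behaviour is sketched below. It assumes that X and Y are the two roots of $u^2 + 2u(p^2 - m^2) - 3m^4 = 0$, with Y the root carrying the positive square root, and that a, b, c take the forms $\tfrac23 + k/(X - Y)$ consistent with $a + \tilde a = b + \tilde b = c + \tilde c = \tfrac43$. The function name `scalar_sector` and the sample values of $p^2$ and $m^2$ are hypothetical.

```python
import math

def scalar_sector(p2, m2):
    """Return X, Y and the coefficients a, b, c for given p^2 and m^2."""
    d = p2 - m2
    root = math.sqrt(d * d + 3.0 * m2 * m2)
    X, Y = -d - root, -d + root
    a = -(4.0 / 3.0) * (p2 - m2) / (X - Y) + 2.0 / 3.0
    b = (2.0 / 3.0) * (p2 - 4.0 * m2) / (X - Y) + 2.0 / 3.0
    c = -(4.0 / 3.0) * (p2 + 2.0 * m2) / (X - Y) + 2.0 / 3.0
    return X, Y, a, b, c

# Generic mass: the product of the roots and the A-coefficient equation.
X, Y, a, b, c = scalar_sector(1.0, 0.25)
a_tilde = 4.0 / 3.0 - a
print(X * Y, -3.0 * 0.25 ** 2)                         # both close to -0.1875
print(a * X + a_tilde * Y, -(8.0 / 3.0) * (1.0 - 0.25))  # both close to -2.0

# Small mass: X tends to -2 p^2, Y to 3 m^4 / (2 p^2), and (a, b, c) to (4/3, 1/3, 4/3).
Xs, Ys, a_s, b_s, c_s = scalar_sector(1.0, 1.0e-4)
print(Xs, Ys, a_s, b_s, c_s)
```

In the small mass case the coefficients approach the special values a = c = 4/3 and b = 1/3, which is the choice c = 4/3 of (A.27).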
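For the inverse of K we have

$$G^{\alpha\beta}_{\mu\nu} = \left(\frac{Q^{(2)}}{p^2 - m^2} - \frac{Q^{(1)}}{m^2} + \frac{Q^{(0)}}{X} + \frac{\tilde Q^{(0)}}{Y}\right)^{\alpha\beta}_{\mu\nu}$$

We look for the small m limit of

$$\frac{Q^{(0)}}{X} + \frac{\tilde Q^{(0)}}{Y}$$

We have

$$\frac{Q^{(0)}}{X} + \frac{\tilde Q^{(0)}}{Y} = \frac{(aY + \tilde a X)A + (bY + \tilde b X)B - (cY + \tilde c X)C}{XY}$$

$$aY + \tilde a X = +\tfrac{4}{3}(p^2 - m^2) + \tfrac{2}{3}(X + Y) = 0 \qquad (A.54)$$

$$bY + \tilde b X = -\tfrac{2}{3}(p^2 - 4m^2) + \tfrac{2}{3}(X + Y) = -2(p^2 - 2m^2) \qquad (A.55)$$

$$cY + \tilde c X = \tfrac{4}{3}(p^2 + 2m^2) + \tfrac{2}{3}(X + Y) = 4m^2$$

Using (A.39) we have

$$\frac{Q^{(0)}}{X} + \frac{\tilde Q^{(0)}}{Y} = \frac{2}{3m^4}\left[(p^2 - 2m^2)B + 2m^2 C\right] \qquad (A.56)$$

Thus for small m

$$\frac{Q^{(0)}}{X} + \frac{\tilde Q^{(0)}}{Y} = \frac{2p^2}{3m^4}B + \frac{4}{3m^2}(C - B)$$

This gives equation (55) for small m.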


In this Appendix we construct the helicity projection operators in the space of
vectors and second rank symmetric tensors in Euclidean three-space.

A vector $A_i$ in three-space has three components: two helicity one components,
$A^{(1)}_i$, and one helicity zero component $A^{(0)}_i$. The helicity one components
are combinations of the transverse components of $A_i$ and so must satisfy (in
momentum space)

$$p^i A^{(1)}_i = 0 \qquad (B.1)$$

The projection operator $R^{(1)}_{ij}$ into this space is

$$R^{(1)}_{ij} = (\eta_{ij} + \hat p_i \hat p_j) \qquad (B.2)$$

where

$$\hat p_i \hat p^i = -1 \qquad (B.3)$$

$$R^{(1)}_{ik} R^{(1)kl} = R^{(1)\,l}_{i} \qquad (B.4)$$

The helicity zero projection operator $R^{(0)}_{ij}$ must be orthogonal to $R^{(1)}$ and is

$$R^{(0)}_{ij} = -\hat p_i \hat p_j \qquad (B.5)$$

A symmetric tensor in three-space has six components and its helicity
decomposition is into helicity two components (2), helicity one components (2)
and two independent helicity zero components. The helicity two components
$h^{(2)}_{ij}$ must satisfy the relations

$$p^i h^{(2)}_{ij} = h^{(2)i}{}_{i} = 0 \qquad (B.6)$$

The projection operator $R^{(2)rs}_{ij}$ into this space is (in momentum space)

$$R^{(2)rs}_{ij} = \tfrac{1}{2}\left[(\delta^r_i + \hat p_i \hat p^r)(\delta^s_j + \hat p_j \hat p^s) + (i \leftrightarrow j)\right] - \tfrac{1}{2}(\eta_{ij} + \hat p_i \hat p_j)(\eta^{rs} + \hat p^r \hat p^s) \qquad (B.7)$$

and, of course,

$$(R^{(2)})^2 = R^{(2)} \qquad (B.8)$$

The helicity one components h™ must have the property 

hQ^dJiV + drf" (B.9) 


p'hf^O (B.10) 

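We find

$$R^{(1)rs}_{ij} = -\tfrac{1}{2}\left[\hat p_i \hat p^r(\delta^s_j + \hat p_j \hat p^s) + \hat p_i \hat p^s(\delta^r_j + \hat p_j \hat p^r) + (i \leftrightarrow j)\right] \qquad (B.11)$$

$$(R^{(1)})^2 = R^{(1)} \qquad (B.12)$$

$$R^{(1)} R^{(2)} = 0 \qquad (B.13)$$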
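
The counting of helicity components can be verified directly. The sketch below is an illustration under a simplifying assumption: it uses an ordinary Euclidean metric $\delta_{ij}$ and a unit vector n with $n \cdot n = +1$, so that the transverse projector is $\theta_{ij} = \delta_{ij} - n_i n_j$. This differs from the sign convention $\hat p_i \hat p^i = -1$ used in the text but leaves the projector algebra unchanged. The sample vector and helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

R = range(3)
n = [Fraction(2, 3), Fraction(1, 3), Fraction(2, 3)]   # hypothetical unit vector: 4/9 + 1/9 + 4/9 = 1

def th(i, j):
    """Transverse projector theta_ij = delta_ij - n_i n_j on three-vectors."""
    return Fraction(int(i == j)) - n[i] * n[j]

def make(fn):
    """Tabulate an operator O_{ij,rs}, keyed by (i, j, r, s)."""
    return {k: fn(*k) for k in product(R, repeat=4)}

def mul(X, Y):
    """Operator product X_{ij,kl} Y_{kl,rs}."""
    return {(i, j, r, s): sum(X[i, j, k, l] * Y[k, l, r, s] for k in R for l in R)
            for i, j, r, s in product(R, repeat=4)}

def trace(X):
    """Trace O_{ij,ij}; for a projector this counts its components."""
    return sum(X[i, j, i, j] for i in R for j in R)

half = Fraction(1, 2)
# Helicity two: symmetric, transverse and traceless in the transverse plane.
R2 = make(lambda i, j, r, s:
          half * (th(i, r) * th(j, s) + th(i, s) * th(j, r)) - half * th(i, j) * th(r, s))
# Helicity one: one longitudinal and one transverse index.
R1 = make(lambda i, j, r, s:
          half * (n[i] * n[r] * th(j, s) + n[i] * n[s] * th(j, r)
                  + n[j] * n[r] * th(i, s) + n[j] * n[s] * th(i, r)))

assert mul(R2, R2) == R2 and mul(R1, R1) == R1
assert all(v == 0 for v in mul(R1, R2).values())
print(trace(R2), trace(R1), 6 - trace(R2) - trace(R1))   # prints: 2 2 2
```

Two helicity two and two helicity one components leave two of the six components of a symmetric three-tensor for the helicity zero sector.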

Just as in Lorentz space (in Appendix A) there will be an infinity of the two 
helicity zero projection operators which can be made up of linear combinations 
of the operators 

hiv rs ,mp r p s , -Hw'p'+pm") . (B.14) 

However from equation (119) we know which helicity zero component of $h_{ij}$ is
not a dynamical variable, namely the one which is proportional to

Ajjh" (B.15) 

The helicity zero projection operator which projects this field out of $h_{ij}$ is

Rf rs = A ii A rs /A mn A m " (B.16) 

Thus the helicity zero component which will be a dynamical variable will be 
obtained by finding that helicity zero projection operator R (0) which is 
orthogonal to R (0) . This operator is easily found to be 

n(0)rs_ m I 

R " ~2Z\ 

2p 2 + 3m 2 


v«+ — —* — PiPijyn + — ^2 — ppJ (B.17) 


Z = (2/ + 4p 2 m 2 + 3m 4 ) 



Accordingly, the dynamical helicity variables are 

hf=Rf rs h rs with a = 0,1, 2 
and the variables canonical to these will be the 

with IT„ given by (121) 
We have 


We may write 

R\r° A rs =0 

R (o) R° = 

Ylt ) = R ( i ? rs Ar k K l ti k ' 



where we have neglected the terms depending on the source r* since they will 
not contribute to the canonical commutation relations. We have 

A*AjiM*<»+^W^)> + $ m ] f (B - 24) 
L p +m \p +m / Jrski 



M0)kl_ * * *k*l 
Srs =PrP s P P 

ST'= l 2(Vr S +P&)(V k '+p k p') 

Using (119) and neglecting source terms we may write 


(Vki +PkPi)h kl = 2 + m 2 WV»' 

Also from (B.21) we may write 


Rf"(Vrs+PrPs)=-^2Rf rS PrP ! 

p +m 



This gives 

Finally we have 

(R (0) S°) k !h kl = \{-^){R°S^!h kl (B.29) 

tf? = Rf"h rs 

m i >= 



p +m 

-K a n rs ~ 7 a ii n, 

ij 2 , 2 JV '/ "" „ 2 '' rs 



for small m, and 

Substituting (B.17) for R (0) and keeping only leading terms for small m we 

U (0) 3(m_Y v (ow 
" 2\p 2 ) " " 

Similarly for small m we may write 




Boulware, D. G. and Deser, S. (1972) Phys. Rev., D6, 3368. 
Deser, S. (1970) J. Gen. Rel. Grav., 1, 9.
Fierz, M. and Pauli, W. (1939) Proc. Roy. Soc., 173, 211. 
Van Dam, H. and Veltman, M. (1970) Nucl. Phys., B22, 397. 

Relativistic Electromagnetic Interaction 
Without Quantum Electrodynamics 


University of Wisconsin, Madison, Wisconsin, U.S.A. 



University of Chicago, Chicago, Illinois, U.S.A., and 
Ohio State University, Columbus, Ohio, U.S.A. 


In the extension of the Dirac theory of the electron to many-electron systems, 
the most obvious difficulty is a satisfactory treatment of relativistic elec- 
tromagnetic interactions. The first such treatment was worked out many years 
ago by Breit (1929, 1930, 1932). A more elaborate derivation of Breit's results, 
based on quantum electrodynamics, was given by Bethe and Salpeter (1957). 
While quantum electrodynamics is thus capable of describing relativistic 
electromagnetic interactions, there are in principle and in practice still serious 
difficulties in this approach. 

One reason for this situation is that relativistic effects are intrinsically far 
from simple. However, quantum electrodynamics complicates matters by 
yielding results which are not readily absorbed in the framework within which 
one naturally deals with many-electron problems. Thus quantum elec- 
trodynamics is organized around a perturbation treatment that regards quan- 
tum electrodynamical effects as weak. But in many-electron systems the 
quantum electrodynamical effects include the Coulomb interactions between 
electrons, and these cannot comfortably be regarded as weak. Again, quantum 
electrodynamics tends to sidestep the question of the influence of quantum 
electrodynamical effects on the wave function. Indeed, it may be inconsistent to 
take this up, since one of these effects is the interaction of an electron with its 
own field, and even a finite retarded self-action compromises the definition of a 
wave function (Feynman, 1948). Treatments of many-electron systems, on the 
other hand, customarily are formulated in terms of wave functions. 

These considerations suggest that a vantage point somewhat different from 
that offered by quantum electrodynamics may be useful in treating relativistic
electromagnetic interactions. We present in this paper a formalism that
provides such an alternative approach.

Since one of our principal objectives is to deal with relativistic electromagne- 
tic interactions in terms of a vocabulary that differs from the one customary in 
quantum electrodynamics, we begin in Sections 1 and 2 with a brief outline of
this vocabulary. Our extensive use of matrix elements is in the spirit of the 
original formulation of quantum mechanics by Heisenberg (1925), Born 
and Jordan (1925), Born, Heisenberg and Jordan (1926) and Dirac (1926). 
Section 3 reviews Dirac's treatment (1928) of a single relativistic electron. 

In Section 4 we take up the actual treatment of relativistic electromagnetic 
interactions. Our considerations apply to an isolated system containing an 
arbitrary number of Dirac particles. Of course our assumption that the system 
is not subject to external disturbances is only an approximation, since real 
physical systems can always interact with external objects through emission or 
absorption of radiation or other means. Here we accept the limitation this
approximation imposes on our treatment, since our interest is in interactions 
between particles. This is necessary in order to deal with a total energy for the 
system which is conserved and therefore capable of precise definition. 

We deal here in terms of many-particle wave functions, and this implies that 
the number of particles remains constant. Thus no provision is made for the 
possible creation and/or annihilation of electron-positron pairs. 

No additional approximation need be adopted in treating the relativistic 
electromagnetic interactions. We find that these interactions can be handled in 
closed form, using the version of classical electrodynamics advocated by 
Wheeler and Feynman (1945, 1949). Just as we desire, this formulation deals 
directly in terms of interparticle interactions, without reference to a mediating 
electromagnetic field. In such a theory there is nothing that compels one to 
include the action of a particle on itself, as Wheeler and Feynman demonstrate. 
Following them, we omit such self-actions, thereby avoiding all of the difficul- 
ties associated with those terms in conventional quantum electrodynamics. 

Quantum mechanical treatments based on the electrodynamics of Wheeler 
and Feynman have previously been presented by Hoyle and Narlikar (1974), 
and also by Davies (1971, 1972). However there is little in this work that bears 
directly on our approach: while these treatments differ in some respects from 
conventional quantum electrodynamics, they adopt similar vantage points and 
vocabularies. Thus, from our point of view, they have essentially the same 
drawbacks as conventional quantum electrodynamics, including, rather unex- 
pectedly, the occurrence of self-interactions. 

In the remaining sections we take up the Pauli approximation. The results 
are significant in their own right, even though we simply reproduce long- 
accepted theoretical expressions. Careful examination of previous derivations 
reveals that they are somewhat inadequate. One problem is that these deriva- 
tions start from an approximate treatment of the interactions. This is hardly to 
be avoided in a treatment based on quantum electrodynamics since this theory 
is organized around a perturbation treatment, but it places one in the awkward 

Detrich and Roothaan 397 

position of extracting one approximation from another. A more severe prob- 
lem comes from the fact that quantum electrodynamics avoids evaluating 
electrodynamical effects on the wave function. To make up for this, one is 
obliged to adopt some more or less arbitrary assumptions for such effects. For 
example, Breit (1929, 1930, 1932) found that part of his approximate interac- 
tion term should not be included in the Hamiltonian used to determine the 
wave function, even though its expectation value contributes to the energy. 

A derivation of the Pauli approximation which is free of such difficulties is 
presented in Section 6. In Section 7 we assess the significance of our results, and
present a few very general suggestions for further work along these lines. 


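For the classical description of a system of N particles, let the cartesian
coordinates and conjugate momenta be denoted by $\mathbf{r}_\mu$ and $\mathbf{p}_\mu$, respectively,
$\mu = 1, 2, \ldots, N$; we shall denote them collectively by

$$\mathbf{r}^N = (\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \qquad \mathbf{p}^N = (\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N) \qquad (1)$$

There exists a Hamiltonian function $H(\mathbf{r}^N, \mathbf{p}^N)$; the time evolution of the
system is governed by the canonical equations of motion

$$\dot{\mathbf{r}}_\mu = \frac{\partial H}{\partial \mathbf{p}_\mu}, \qquad \dot{\mathbf{p}}_\mu = -\frac{\partial H}{\partial \mathbf{r}_\mu} \qquad (2)$$

From these equations one can in principle calculate $\mathbf{r}_\mu(t)$ and $\mathbf{p}_\mu(t)$ for any time
t, given a set of initial values $\mathbf{r}_\mu(0)$, $\mathbf{p}_\mu(0)$.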

To describe the N-particle system quantum mechanically, we replace the 
coordinates and momenta by appropriate operators. It is customary to use for 
these operators the same symbols as for the classical quantities they replace. 
The coordinates and momenta must satisfy the well-known commutation 
relations, which can be written in the two equivalent forms 




la • p M , 

where [A, B] = AB - BA is the commutator of the operators A and B, and $\mathbf{a}$ is
any constant vector.

In general, algebraic equations between classical quantities are reinterpreted 
as operator equations. This process is usually not unique, since any product 
involving conjugate coordinates and momenta becomes dependent on the
order of the factors, because of the commutation relations (3). Often the
requirement that any observable must be represented by a Hermitian operator
resolves the ambiguity. With that understanding the Hamiltonian operator is
now defined in terms of the operators $\mathbf{r}_\mu$, $\mathbf{p}_\mu$ by the functional expression
$H(\mathbf{r}^N, \mathbf{p}^N)$.

The replacement of the classical variables by operators leaves undefined the 
operators corresponding to time derivatives of those variables. The time 
derivative of an operator A which does not explicitly depend on the time is now 
defined by 

$$\dot A = i\hbar^{-1}[H, A] \qquad (4)$$

If equation (4) is used to evaluate $\dot{\mathbf{r}}_\mu$ and $\dot{\mathbf{p}}_\mu$, one can prove, using the
commutation relations (3), that the canonical equations (2) are now valid as 
operator equations. 

In the Schrodinger representation, which we shall use throughout this paper, 
the coordinates are taken over without change as multiplicative operators. The 
momenta must then be defined by 

$$\mathbf{p}_\mu = -i\hbar\nabla_\mu = -i\hbar\frac{\partial}{\partial \mathbf{r}_\mu} \qquad (5)$$


in order that the commutation relations (3) are satisfied. 

While classically the time evolution of the system is specified through the
explicit functions of the time $\mathbf{r}_\mu(t)$ and $\mathbf{p}_\mu(t)$, quantum mechanically the time
evolution is expressed in the time-dependent wave function $\Psi(\mathbf{r}^N, t)$, which must
satisfy the time-dependent Schrodinger equation

$$i\hbar\frac{\partial \Psi}{\partial t} = H\Psi \qquad (6)$$

This equation can be satisfied by wave functions representing stationary states,

$$\Psi(\mathbf{r}^N, t) = \Psi(\mathbf{r}^N)\,e^{-iEt/\hbar} \qquad (7)$$

where E is the energy of the stationary state represented by $\Psi(\mathbf{r}^N, t)$. The
time-independent wave function $\Psi(\mathbf{r}^N)$ satisfies the time-independent Schrodinger
equation

$$H\Psi = E\Psi \qquad (8)$$

which states that $\Psi(\mathbf{r}^N)$ and E must be eigenfunction and eigenvalue, respectively,
of the operator H.

The determination of the entire set of eigenvalues and corresponding 
eigenfunctions of a given Hamiltonian constitutes the central problem of 
non-relativistic quantum mechanics. Conceptually, the simplest systems are
those for which the entire eigenvalue spectrum is discrete, each eigenvalue
having only a finite number of linearly independent eigenfunctions; the har- 
monic oscillator is the best known example. 

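For the all-discrete case it is natural to denote the stationary state wave
functions by

$$\Psi_{m\alpha}(\mathbf{r}^N, t) = \Psi_{m\alpha}(\mathbf{r}^N)\,e^{-i\omega_m t}, \qquad \omega_m = E_m/\hbar \qquad (7a)$$

where m is an appropriate discrete index to label the distinct energies, while $\alpha$
labels degenerate wave functions if necessary. For the time-independent wave
functions we have of course

$$H\Psi_{m\alpha} = E_m\Psi_{m\alpha} \qquad (8a)$$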

It is well-known that degeneracy is a necessary consequence of the symmetry 
of the problem. A symmetry operator is by definition a unitary operator which 
commutes with the Hamiltonian. The symmetry operators form a group; the 
wave functions $\Psi_{m\alpha}$ which belong to the energy $E_m$ transform among
themselves under symmetry operations, and the transformation matrices constitute
a representation of the group. If we allow for the possibility $E_m = E_n$, $m \neq n$,
then it is no loss of generality to assume that the representation associated with
a level $E_m$ is always irreducible; the particular representation to which the set
$\Psi_{m\alpha}$ belongs is called its symmetry species. The normal situation is that the $E_m$
are distinct; whenever $E_m = E_n$, $m \neq n$ occurs, it is called accidental
degeneracy. Consideration of the symmetry properties of the wave function leads to the
so-called good quantum numbers ; a corresponding compound labelling of the 
energy and wave functions is usually adopted. For our present purposes, this is 
not necessary; we label the energies with a single index m, and recognize 
non-accidental degeneracy of each level by the wave function label ma. 

It is customary to postulate that the time-independent functions $\Psi_{m\alpha}$
constitute a complete orthonormal basis in the Hilbert space of functions used to
describe the N-particle system at any given time t. The orthonormality is
expressed by

$$\langle\Psi_{m\alpha}|\Psi_{n\beta}\rangle = \int d\mathbf{r}^N\,\Psi^*_{m\alpha}(\mathbf{r}^N)\Psi_{n\beta}(\mathbf{r}^N) = \delta_{m\alpha,n\beta} = \delta_{mn}\delta_{\alpha\beta} \qquad (9)$$

where $\int d\mathbf{r}^N$ denotes 3N-dimensional integration over all particle coordinates.
Completeness of the basis is conveniently expressed in Dirac notation by

$$\mathcal{I} = \sum_{m\alpha}|\Psi_{m\alpha}\rangle\langle\Psi_{m\alpha}| \qquad (10)$$

where $\mathcal{I}$ is the identity operator; in ordinary functional notation completeness
is expressed by

$$\Psi = \sum_{m\alpha}\Psi_{m\alpha}\langle\Psi_{m\alpha}|\Psi\rangle \qquad (10a)$$

where $\Psi$ is an arbitrary function within certain reasonable constraints. This
completeness postulate is fundamental to quantum mechanics; hence only
Hamiltonians which yield as their eigenfunctions complete sets can be used to
describe physical systems.

Most systems of physical interest do not fall in the all-discrete category. The
other extreme occurs for a free particle, where the entire spectrum consists of
all $E \geq 0$, so that it is all-continuous. Most common is the mixed
discrete-continuous case; for instance, the hydrogen atom has continuous eigenvalues
$E \geq 0$, and discrete eigenvalues $E_m = -R/m^2$, $m = 1, 2, \ldots$, where R is the
Rydberg constant. Clearly, equations (7a), (9), (10) and (10a) must now be
reinterpreted. In general, the index m has a discrete and a continuous range,
while $\alpha$ can be maintained as a discrete index. With this understanding,
equation (7a) remains valid as it stands. In equation (9), if either m or n is in the
discrete range, no change is required; but if both m and n are in the continuum,
$\delta_{mn}$ must be replaced by a Dirac delta-function. In equations (10) and (10a) we
must deal with the sum $\sum_{m\alpha} = \sum_m \sum_\alpha$. It is clear that $\sum_\alpha$ is to be retained as a
discrete sum; however $\sum_m$ must be understood as a discrete sum and/or
integration, depending on whether $E_m$ is in the discrete or continuous range,
respectively.

In our further deliberations we shall restrict ourselves to the all-discrete 
notation. This, however, should not limit the validity of our final results, since 
in the latter all references to stationary states will have disappeared. 

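We use the stationary states to define the time-dependent and
time-independent matrix elements of an operator, namely

$$A_{m\alpha,n\beta}(t) = \langle\Psi_{m\alpha}(t)|A|\Psi_{n\beta}(t)\rangle = \int d\mathbf{r}^N\,\Psi^*_{m\alpha}(\mathbf{r}^N, t)\,A(\mathbf{r}^N, \mathbf{p}^N)\,\Psi_{n\beta}(\mathbf{r}^N, t)$$

$$A_{m\alpha,n\beta} = \langle\Psi_{m\alpha}|A|\Psi_{n\beta}\rangle = \int d\mathbf{r}^N\,\Psi^*_{m\alpha}(\mathbf{r}^N)\,A(\mathbf{r}^N, \mathbf{p}^N)\,\Psi_{n\beta}(\mathbf{r}^N) \qquad (11)$$

Clearly, $A_{m\alpha,n\beta}(t)$ and $A_{m\alpha,n\beta}$ are related by

$$A_{m\alpha,n\beta}(t) = A_{m\alpha,n\beta}\,e^{i\omega_{mn}t}, \qquad \omega_{mn} = \omega_m - \omega_n \qquad (12)$$

For m = n the time dependence drops out, and we write

$$A_{m\alpha,m\beta}(t) = A_{m\alpha,m\beta} = A_{m\alpha\beta} \qquad (13)$$

We note that the matrix elements of the Hamiltonian are given by

$$H_{m\alpha,n\beta} = E_m\delta_{m\alpha,n\beta} \qquad (14)$$

The matrix elements of the product of two operators are defined by

$$(AB)_{m\alpha,n\beta} = \sum_{k\gamma}A_{m\alpha,k\gamma}B_{k\gamma,n\beta} \qquad (15)$$

this is easily proved using equations (7a), (10) and (11).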


For the time derivatives of the matrix elements we invoke the definition in
terms of the commutator with the Hamiltonian, equation (4); we readily find

$$\dot A_{m\alpha,n\beta}(t) = i\omega_{mn}A_{m\alpha,n\beta}(t), \qquad \dot A_{m\alpha,n\beta} = i\omega_{mn}A_{m\alpha,n\beta} \qquad (16)$$

We note that the first equation (16) would also result if we defined $\dot A_{m\alpha,n\beta}(t)$ as
$dA_{m\alpha,n\beta}(t)/dt$; this is of course the justification, in retrospect, for the definition
of the time derivative of an operator, equation (4). The second equation (16)
extends this definition by providing a time derivative for the matrix elements
$A_{m\alpha,n\beta}$, even though these are time-independent quantities.
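
The two statements in equation (16) can be illustrated with a small finite example. The sketch below assumes a three-level system with a diagonal Hamiltonian, units in which $\hbar = 1$, and an arbitrary Hermitian matrix standing in for the operator A; the energies, the matrix entries and the function names are all hypothetical.

```python
import cmath

HBAR = 1.0                      # assumption: units with hbar = 1
E = [0.0, 1.5, 4.0]             # hypothetical energy levels E_m
A = [[1.0, 2 - 1j, 0.5j],       # hypothetical Hermitian matrix of elements A_mn
     [2 + 1j, -1.0, 3.0],
     [-0.5j, 3.0, 2.0]]
N = len(E)

def omega(m, n):
    """Transition frequency omega_mn = (E_m - E_n) / hbar."""
    return (E[m] - E[n]) / HBAR

def commutator_rate(m, n):
    """Matrix element of (i / hbar) [H, A] for diagonal H with entries E_m."""
    H = [[E[i] if i == j else 0.0 for j in range(N)] for i in range(N)]
    HA = sum(H[m][k] * A[k][n] for k in range(N))
    AH = sum(A[m][k] * H[k][n] for k in range(N))
    return 1j / HBAR * (HA - AH)

def A_t(m, n, t):
    """Time-dependent matrix element A_mn(t) = A_mn exp(i omega_mn t)."""
    return A[m][n] * cmath.exp(1j * omega(m, n) * t)

def numerical_rate(m, n, t, h=1.0e-5):
    """Central-difference estimate of dA_mn(t)/dt."""
    return (A_t(m, n, t + h) - A_t(m, n, t - h)) / (2.0 * h)

for m in range(N):
    for n in range(N):
        lhs = commutator_rate(m, n)
        rhs = 1j * omega(m, n) * A[m][n]
        print(m, n, abs(lhs - rhs), abs(numerical_rate(m, n, 0.7) - 1j * omega(m, n) * A_t(m, n, 0.7)))
```

Both printed differences are zero to within rounding and discretization error: the commutator definition and the ordinary time derivative of $A_{m\alpha,n\beta}(t)$ agree, and diagonal elements (m = n) have vanishing time derivative.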

Quantization of classical relations can now be stated in terms of matrix 
elements, preferably the time-independent ones. One simply replaces classical 
quantities by the corresponding matrix elements, honouring the rules for 
products and time derivatives, equations (15) and (16); of course ambiguities 
due to the order of factors have to be resolved in the same manner as for the 
operators. Clearly, quantization in terms of matrix elements is completely 
equivalent to quantization in terms of operators, and vice versa. 

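From the matrix elements and wave functions one can recover the operators.
In general

$$A = \sum_{m\alpha}\sum_{n\beta}|\Psi_{m\alpha}\rangle A_{m\alpha,n\beta}\langle\Psi_{n\beta}| \qquad (17)$$

For the Hamiltonian we have the special formula

$$H = \sum_{m\alpha}|\Psi_{m\alpha}\rangle E_m\langle\Psi_{m\alpha}| \qquad (18)$$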
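
The recovery of an operator from its matrix elements can be made concrete in a two-dimensional space. This is a minimal sketch assuming real vectors and a hypothetical 2 x 2 Hamiltonian matrix whose eigenvectors play the role of the stationary states; none of the numbers or function names come from the text.

```python
import math

s = 1.0 / math.sqrt(2.0)
H = [[2.0, 1.0], [1.0, 2.0]]      # hypothetical Hamiltonian matrix
energies = [1.0, 3.0]             # its eigenvalues E_m
states = [[s, -s], [s, s]]        # corresponding orthonormal eigenvectors

def apply(M, v):
    """Matrix times vector."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

def matrix_elements(M):
    """Matrix elements M_mn of M with respect to the eigenvectors of H."""
    return [[dot(states[m], apply(M, states[n])) for n in range(2)] for m in range(2)]

def rebuild(elems):
    """Sum over m, n of |m> elems[m][n] <n|, returned as a 2 x 2 matrix."""
    return [[sum(states[m][i] * elems[m][n] * states[n][j]
                 for m in range(2) for n in range(2))
             for j in range(2)] for i in range(2)]

A = [[0.5, -2.0], [-2.0, 4.0]]    # hypothetical operator
print(matrix_elements(H))          # close to [[1, 0], [0, 3]], i.e. E_m on the diagonal
print(rebuild(matrix_elements(A))) # close to A itself
print(rebuild([[1.0, 0.0], [0.0, 1.0]]))  # completeness: close to the identity
```

Rebuilding from the diagonal elements $E_m$ alone reproduces H, which is the content of the special formula for the Hamiltonian.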



In addition to the operators for the positions and momenta of the particles,
we shall need operators representing the charge and current densities
associated with the particles. We define the charge density associated with the
$\mu$th particle by means of

$$\rho_\mu(\mathbf{r}') = e_\mu\,\delta(\mathbf{r}' - \mathbf{r}_\mu) \qquad (19)$$

Here $e_\mu$ is the charge of the $\mu$th particle, and $\delta(\mathbf{r}' - \mathbf{r}_\mu)$ is the three-dimensional
Dirac delta-function, so that the relation

$$\int d\mathbf{r}'\,f(\mathbf{r}')\,\delta(\mathbf{r}' - \mathbf{r}_\mu) = f(\mathbf{r}_\mu) \qquad (20)$$

holds for any function $f(\mathbf{r})$ which is reasonably well-behaved. In equation (19),
the space indicated by $\mathbf{r}'$ designates the position of an external electric probe,
and thus may be regarded as the observer's space. In the following, we shall use
$\mathbf{r}'$, $\mathbf{r}''$ for electromagnetic probe positions, and $\mathbf{r}_\mu$, $\mathbf{r}_\nu$ for particle positions, or
simply $\mathbf{r}$ for the position of a single particle.

In conjunction with the charge density we define the current density operator
by means of

$$\mathbf{j}_\mu(\mathbf{r}') = (2c)^{-1}\{\dot{\mathbf{r}}_\mu, \rho_\mu(\mathbf{r}')\} \qquad (21)$$

where in general {A, B} = AB + BA is the anticommutator of the operators A
and B, and the velocity $\dot{\mathbf{r}}_\mu$ is defined by equation (4), specifically

$$\dot{\mathbf{r}}_\mu = i\hbar^{-1}[H, \mathbf{r}_\mu] \qquad (22)$$

Since $\dot{\mathbf{r}}_\mu$ and $\rho_\mu(\mathbf{r}')$ are both Hermitian, the symmetric product on the
right-hand side of equation (21) guarantees that $\mathbf{j}_\mu(\mathbf{r}')$ is also Hermitian. Note that we
have used the electromagnetic rather than the electrostatic definition of the
current density; $\mathbf{j}_\mu(\mathbf{r}')$ has the same dimension as $\rho_\mu(\mathbf{r}')$.

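Up to this point we have not made any use of the specific form of the
Hamiltonian. For our system of point particles the non-relativistic Hamiltonian
is

$$H(\mathbf{r}^N, \mathbf{p}^N) = T(\mathbf{p}^N) + V(\mathbf{r}^N) \qquad (23)$$

where

$$T(\mathbf{p}^N) = \sum_\mu (2m_\mu)^{-1}\,\mathbf{p}_\mu \cdot \mathbf{p}_\mu \qquad (24)$$

is the kinetic energy, and $V(\mathbf{r}^N)$ is the potential energy. We find easily for the
velocity of the particle

$$\dot{\mathbf{r}}_\mu = m_\mu^{-1}\mathbf{p}_\mu \qquad (25)$$

and therefore for the current density

$$\mathbf{j}_\mu(\mathbf{r}') = (2m_\mu c)^{-1}\{\mathbf{p}_\mu, \rho_\mu(\mathbf{r}')\} \qquad (26)$$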

We are now in a position to demonstrate that the charge and current
densities obey the equation of continuity, which also may be called the law of
charge conservation. Let $f_\mu = f(\mathbf{r}_\mu)$ be an arbitrary function of the position $\mathbf{r}_\mu$.
Clearly $f_\mu$ commutes with $V(\mathbf{r}^N)$, hence

$$[H, f_\mu] - [T, f_\mu] = 0 \qquad (27)$$

Using equations (4), (5) and (24) we can derive from this the operator equation

$$\dot f_\mu - (2m_\mu)^{-1}\left[\mathbf{p}_\mu \cdot (\nabla_\mu f_\mu) + (\nabla_\mu f_\mu) \cdot \mathbf{p}_\mu\right] = 0 \qquad (28)$$

where $(\nabla_\mu f_\mu)$ is another function of position, that is, the operator $\nabla_\mu$ does not
act beyond the parentheses. On the other hand, taking matrix elements of
equation (27) we find

$$(E_m - E_n)\langle\Psi_{m\alpha}|f_\mu|\Psi_{n\beta}\rangle - (2m_\mu)^{-1}\langle\Psi_{m\alpha}|[\mathbf{p}_\mu \cdot \mathbf{p}_\mu, f_\mu]|\Psi_{n\beta}\rangle = 0 \qquad (29)$$

Obviously, equations (28) and (29) are completely equivalent.

If we take $f_\mu = \rho_\mu(\mathbf{r}')$ in equation (28), we can replace $\nabla_\mu$ by $-\nabla'$, and obtain,
using also equation (26),

$$\dot\rho_\mu(\mathbf{r}') + c\nabla' \cdot \mathbf{j}_\mu(\mathbf{r}') = 0 \qquad (30)$$

We recognize equation (30) as the equation of continuity for the $\mu$th particle;
note that charge conservation holds for each particle separately. The unusual
notation $\dot\rho_\mu$ rather than $\partial\rho_\mu/\partial t$ is due to the fact that the observer's space $\mathbf{r}'$ was
introduced as a parametric variable. Another useful form of the equation of
continuity is obtained by taking the matrix elements of equation (30); the result
is

$$ik_{mn}\,\rho_{\mu,m\alpha,n\beta}(\mathbf{r}') + \nabla' \cdot \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}') = 0 \qquad (31)$$

where the wavenumbers $k_{mn}$ are defined by

$$k_{mn} = \omega_{mn}/c = (E_m - E_n)/\hbar c \qquad (32)$$



Finally, we make the observation that equations (27) to (31) are all equival- 
ent. For instance, equation (29) is easily derived from equation (31) by 
multiplying the latter by /(r') and integrating over r'. Note, however, that 
equations (27) to (29) are only meaningful with reference to the non-relativistic 
Hamiltonian; equations (30) and (31) contain the charge and current densities 
formally, and may therefore be valid for other Hamiltonians as well. 
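
As a concrete illustration of the wavenumbers $k_{mn} = (E_m - E_n)/\hbar c$, the sketch below evaluates them for the discrete hydrogen levels $E_m = -R/m^2$ quoted earlier. The numerical values of the Rydberg energy and of $\hbar c$ are standard reference values inserted here as assumptions (infinite nuclear mass, no relativistic corrections); the function names are hypothetical.

```python
import math

RYDBERG_EV = 13.605693        # assumed Rydberg energy in eV (infinite nuclear mass)
HBAR_C_EV_NM = 197.3269804    # assumed value of hbar * c in eV nm

def energy(m):
    """Discrete hydrogen level E_m = -R / m^2 in eV."""
    return -RYDBERG_EV / m ** 2

def wavenumber(m, n):
    """k_mn = (E_m - E_n) / (hbar c) in inverse nanometres; antisymmetric in m, n."""
    return (energy(m) - energy(n)) / HBAR_C_EV_NM

def wavelength(m, n):
    """Wavelength 2 pi / |k_mn| in nanometres."""
    return 2.0 * math.pi / abs(wavenumber(m, n))

print(wavenumber(2, 1))   # about 0.0517 nm^-1
print(wavelength(2, 1))   # about 121.5 nm
```

The sign of $k_{mn}$ follows the sign of $E_m - E_n$, so $k_{nm} = -k_{mn}$ and $k_{mm} = 0$, in line with the vanishing time derivative of diagonal matrix elements.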


In the one-particle relativistic quantum mechanics of Dirac the wave function is
generalized to a four-component spinor; in particular, for the stationary states
we have

$$\Psi_{s,m\alpha}(\mathbf{r}, t) = \Psi_{s,m\alpha}(\mathbf{r})\,e^{-i\omega_m t}, \qquad s = 1, 2, 3, 4 \qquad (33)$$

It is useful to consider the index s as another variable in the wave function; it is
obviously a discrete variable.

For most formal considerations, the index s can be suppressed. When that is
done, we understand that the time-dependent and time-independent wave functions
$\Psi(\mathbf{r}, t)$ and $\Psi(\mathbf{r})$ represent column vectors; the corresponding
Hermitian conjugate row vectors are designated by $\Psi^\dagger(\mathbf{r}, t)$ and $\Psi^\dagger(\mathbf{r})$, respectively.
Product formation is to be treated according to the rules of matrix algebra;
$\Psi^\dagger\Phi$ implies summation over spinor components, and $\Psi\Phi^\dagger$ is a 4 x 4 matrix.
The latter is an example of an operator which acts on the discrete variable; in
general, an operator A is a 4 x 4 matrix with elements $A_{st}$. If an operator does
not act on the discrete variable, as, for instance, the position $\mathbf{r}$ or the
momentum $\mathbf{p}$, the 4 x 4 identity matrix is implied as a factor to make it a
genuine operator in the world of spinors; we say that the operator is
Dirac-diagonal.

It is to be noted that the matrix structure of operators due to the spinor 
character of the wave functions is separate and distinct from the formation of 
matrix elements with respect to stationary state wave functions. The latter are 
defined, as before, by equations (11); however, with the present meaning of 
wave functions and operators, the integrals in equations (11) contain summa- 
tions over spinor components as well as integrations. 


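In the Dirac theory a central role is played by the set of Dirac matrices $\alpha_x$, $\alpha_y$,
$\alpha_z$, $\beta$ satisfying

$$\alpha_x^\dagger = \alpha_x, \qquad \alpha_y^\dagger = \alpha_y, \qquad \alpha_z^\dagger = \alpha_z, \qquad \beta^\dagger = \beta$$

$$\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = I$$

$$\{\alpha_x, \alpha_y\} = \{\alpha_x, \alpha_z\} = \{\alpha_y, \alpha_z\} = \{\alpha_x, \beta\} = \{\alpha_y, \beta\} = \{\alpha_z, \beta\} = 0 \qquad (34)$$

where I is the 4 x 4 identity matrix. The equations (34) express that the Dirac
matrices are unitary, Hermitian, and anticommute with one another. While any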
choice of matrices satisfying equations (34) is acceptable, a commonly chosen 
representation is given by 

a r = 

a, = 


It can be shown that any other choice satisfying equations (34) can be 
transformed into the form (35) by a similarity transformation, which, with a 
corresponding transformation of the spinor wave functions, yields the same 
physical results. 
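
The defining relations of the Dirac matrices are easy to verify for an explicit choice. The sketch below assumes the standard Dirac-Pauli representation, in which each $\alpha$ is built from the corresponding Pauli matrix in the off-diagonal 2 x 2 blocks and $\beta$ = diag(1, 1, -1, -1); whether this is the exact representation printed in the text cannot be confirmed from the surviving lines. The helper names are hypothetical.

```python
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# Assumed standard Dirac-Pauli representation.
alpha_x = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
alpha_y = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
alpha_z = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(X, Y):
    """Product of two 4 x 4 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(X):
    """Hermitian conjugate (conjugate transpose)."""
    return [[complex(X[j][i]).conjugate() for j in range(4)] for i in range(4)]

def anticommutator(X, Y):
    """{X, Y} = XY + YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] + YX[i][j] for j in range(4)] for i in range(4)]

matrices = {"alpha_x": alpha_x, "alpha_y": alpha_y, "alpha_z": alpha_z, "beta": beta}
names = list(matrices)

for name, M in matrices.items():
    print(name, "Hermitian:", dagger(M) == M, "square is identity:", matmul(M, M) == I4)

for i in range(len(names)):
    for j in range(i + 1, len(names)):
        pair_ok = anticommutator(matrices[names[i]], matrices[names[j]]) == ZERO
        print(names[i], names[j], "anticommute:", pair_ok)
```

Since each matrix is Hermitian and squares to the identity, each is also unitary, as the text states.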
With the help of the operators $\boldsymbol{\alpha}$, $\beta$ we can now define the Dirac Hamiltonian

$$H_D = \beta m'c^2 + c\,\boldsymbol{\alpha} \cdot \mathbf{p} + V(\mathbf{r}) \qquad (36)$$

where $V(\mathbf{r})$ is the potential energy of the particle. Note that neither $\mathbf{p}$ nor $V(\mathbf{r})$
acts on the discrete variable. We wrote m' for the mass of the particle, rather
than m, in order to avoid confusion with the wave function index m.

With the Dirac Hamiltonian in hand, and the interpretation of the wave 
functions as spinors, we can now translate most of the formalism of the 
preceding section into the proper one-particle relativistic equivalent. Up to and 
including equation (22) we only need to drop the individual particle labels, and 
replace H by H D . The specific definition of the Hamiltonian, equation (23), is 
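replaced by equation (36). In lieu of equations (25) and (26) we now get for the
velocity and current density

$$\dot{\mathbf{r}} = \boldsymbol{\alpha}c \qquad (37)$$

$$\mathbf{j}(\mathbf{r}') = \boldsymbol{\alpha}\rho(\mathbf{r}') \qquad (38)$$

We can again demonstrate the validity of the equation of continuity. We note
that a function of position $f(\mathbf{r})$ commutes with V and with $\beta m'c^2$, hence

$$[H_D, f] - c[\boldsymbol{\alpha} \cdot \mathbf{p}, f] = 0 \qquad (39)$$

Using equations (4) and (5) we can derive from this the operator equation

$$\dot f - c\,\boldsymbol{\alpha} \cdot (\nabla f) = 0 \qquad (40)$$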

On the other hand, taking matrix elements of equation (39) we obtain

$$(E_m - E_n)\langle\Psi_{m\alpha}|f|\Psi_{n\beta}\rangle - c\langle\Psi_{m\alpha}|[\boldsymbol{\alpha} \cdot \mathbf{p}, f]|\Psi_{n\beta}\rangle = 0 \qquad (41)$$

Equations (40) and (41) are completely equivalent.

The equation of continuity follows readily in this case if we specify $f = \rho(\mathbf{r}')$ in
equation (40), make use of equation (38) for the current density, and observe
that we can replace $\nabla$ by $-\nabla'$:

$$\dot\rho(\mathbf{r}') + c\nabla' \cdot \mathbf{j}(\mathbf{r}') = 0 \qquad (42)$$

or, taking matrix elements,

$$ik_{mn}\,\rho_{m\alpha,n\beta}(\mathbf{r}') + \nabla' \cdot \mathbf{j}_{m\alpha,n\beta}(\mathbf{r}') = 0 \qquad (43)$$

Equations (42) and (43) are identical with the non-relativistic equations (30)
and (31), if we omit the particle index $\mu$ in the latter. Note however that the
resemblance is a formal one, since the current densities are given by different
expressions, equations (26) and (38), in the two cases. This also accounts for the
fact that equations (40) and (41) cannot be obtained from equations (28) and
(29) by dropping the particle index $\mu$.


The behaviour and mutual interaction of electrically charged particles can, in 
first approximation, be stated in terms of electrostatic forces only. If the 
external field is time-independent, we have a Hamiltonian system, and the 
quantization sketched in Section 2 applies. However if we want to do justice to 
the fact that we are dealing with currents as well as charges, and magnetic as 
well as electric fields, the system is no longer Hamiltonian: the energy cannot 
be expressed only in terms of the instantaneous positions and momenta of the 
particles, since mutual interaction of the particles involves retarded and/or 
advanced potentials. 

Clearly it is a desirable goal to reformulate the problem in a Hamiltonian 
manner: this is the approach taken in the development of quantum elec- 
trodynamics. In that approach, new dynamical variables are introduced which 
describe the electromagnetic field, and the combined system of particles and 
field is considered, to which the rules of quantization are then applied. As is 
well known, this process builds in the self-energy of the particles, which 
presents considerable conceptual and calculational difficulties. 

In this paper we present an alternative approach which avoids the difficulties
just mentioned. It will be seen that our scheme yields an unambiguous and valid
framework to deal with systems of electromagnetic particles beyond the
electrostatic approximation, at least for moderate energies.

We wish to retain as much as possible the concepts and methods of the 
preceding sections. Although we do not have a Hamiltonian operator, we still 
assume that with respect to an external observer the system can be described in 
terms of stationary states, as expressed by equation (7a). Since we no longer 
have a Schrodinger or a Dirac equation, the wave functions and energies will 
have to be determined from some other principle; developing such a principle 
is one of the key objectives of this work. 

Inasmuch as the energy is not a relativistic invariant, the assumption of 
stationary states is not a covariant one; rather we have chosen a specially simple 
form of representation for one Lorentz frame, namely the rest frame of the 
observer. Also, a consistent covariant formulation would require N time 
coordinates as companions to the N sets of space coordinates; our wave 
functions maintain N sets of space coordinates, but only one time. 

While we thus adopt stationary state functions of the type (7a), we must of
course demand that the time-dependent and time-independent wave functions,
$\Psi_{m\alpha}(t)$ and $\Psi_{m\alpha}$, are appropriate generalizations of proper
relativistic one-particle wave functions. We shall restrict ourselves to Dirac
particles: we permit different masses and charges, but each particle, if alone,
would be represented by a four-component Dirac spinor.

In general, for $N$-particle wave functions the coordinate space is the direct
product space of the $N$ single particle spaces. Since the four components of a
Dirac spinor may be considered to arise from a discrete variable capable of four
values, the discrete aspect of the $N$-particle product space leads to a spinor with
$4^N$ components. We label these components by the compound index

$$s^N = (s_1, s_2, \ldots, s_N)$$

where $s_\mu$ refers to the components with respect to the $\mu$th particle, so that
$1 \le s_\mu \le 4$. The stationary state spinor wave functions are given by

$$\Psi_{s_1 s_2 \ldots s_N,\,m\alpha}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = \Psi_{s_1 s_2 \ldots s_N,\,m\alpha}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\,e^{-iE_m t/\hbar}$$

or, in condensed notation

$$\Psi_{s^N,\,m\alpha}(\mathbf{r}^N, t) = \Psi_{s^N,\,m\alpha}(\mathbf{r}^N)\,e^{-iE_m t/\hbar}$$

As in the single-particle case, the spinor index $s^N$ can usually be suppressed.
When that is done, we understand the time-dependent and time-independent wave
functions $\Psi$ to represent column vectors, the components being ordered by taking
$s_1 s_2 \cdots s_N$ in dictionary order; the corresponding $\Psi^*$ are of course the
complex conjugate row vectors. Again, matrix algebra applies: $\Psi^*\Phi$ is a
scalar, and $\Psi\Phi^*$ is a $4^N \times 4^N$ matrix. In general, an operator $A$
is a $4^N \times 4^N$ matrix with elements $A_{s_1 s_2 \ldots s_N,\,t_1 t_2 \ldots t_N}$, or,
in condensed notation, $A_{s^N,t^N}$. Taking the matrix elements of an operator with
respect to the stationary state wave functions is again accomplished by equations
(11), which implies complete summations over spinor components as well as
integrations.

If $a$ is any $4 \times 4$ matrix operator for a single particle with components
$a_{st}$, we define the corresponding operator for the $\mu$th particle, $a_\mu$, by

$$(a_\mu)_{s^N,t^N} = \delta_{s_1 t_1}\delta_{s_2 t_2}\cdots\delta_{s_{\mu-1} t_{\mu-1}}\,a_{s_\mu t_\mu}\,\delta_{s_{\mu+1} t_{\mu+1}}\cdots\delta_{s_N t_N}$$

In matrix notation, this is expressed by

$$a_\mu = 1 \times 1 \times \cdots \times 1 \times a \times 1 \times \cdots \times 1$$

where $\times$ indicates direct multiplication of matrices, and $a$ occurs as the
$\mu$th factor. It is easily seen that if $a_\mu$ and $b_\nu$ are any two operators so
constructed, we have

$$[a_\mu, b_\nu] = 0, \qquad \mu \ne \nu \tag{47}$$

Among the one-particle operators which we can construct according to equation
(46) are the generalized Dirac matrices $\boldsymbol{\alpha}_\mu$, $\beta_\mu$, the
coordinates and momenta $\mathbf{r}_\mu$, $\mathbf{p}_\mu$, and the charge density
$\rho_\mu(\mathbf{r}')$.
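
The direct-product construction and the commutation rule for operators belonging to different particles can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `embed` and `commutator`, the choice of three particles and the random test matrices are illustrative assumptions, not part of the original formalism.

```python
import numpy as np
from functools import reduce

def embed(a, mu, n_particles, dim=4):
    """Return 1 x ... x a x ... x 1 with `a` as the mu-th factor (mu counted from 1)."""
    factors = [np.eye(dim, dtype=complex) for _ in range(n_particles)]
    factors[mu - 1] = np.asarray(a, dtype=complex)
    return reduce(np.kron, factors)  # direct (Kronecker) product of all factors

def commutator(x, y):
    return x @ y - y @ x

# Two arbitrary 4x4 single-particle matrices (random, purely for illustration).
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
b = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

N = 3                      # hypothetical particle number
a_1 = embed(a, 1, N)       # acts on particle 1; a 4^N x 4^N matrix
b_3 = embed(b, 3, N)       # acts on particle 3
b_1 = embed(b, 1, N)       # acts on particle 1

different_particles_commute = np.allclose(commutator(a_1, b_3), 0)
same_particle_commute = np.allclose(commutator(a_1, b_1), 0)
```

Operators placed in different slots commute whatever the single-particle matrices are, whereas two generic matrices placed in the same slot do not.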

Since we do not have a Hamiltonian in hand, time derivatives of operators
cannot be defined by equation (4), but we must in general use the matrix
element form, equations (16), instead. However, for the velocity
$\dot{\mathbf{r}}_\mu$ we desire a simpler definition in terms of fundamental
operators associated with the $\mu$th particle; this requirement is dictated by the
fundamental role played by the current density in electromagnetic phenomena.
Taking our cue from the single particle Dirac formalism, we adopt the
generalization of equation (37) as an ad hoc postulate, namely

$$\dot{\mathbf{r}}_\mu = \boldsymbol{\alpha}_\mu c \tag{48}$$

Applying equation (21) we then find for the current density

$$\mathbf{j}_\mu(\mathbf{r}') = \boldsymbol{\alpha}_\mu\,\rho_\mu(\mathbf{r}') \tag{49}$$

analogous to equation (38).

Another relation which cannot be derived in the absence of a Hamiltonian is
the equation of continuity; because of the fundamental significance of charge
conservation, we postulate its validity. Since $\dot{\rho}_\mu(\mathbf{r}')$ is not
defined as an operator, we must take the matrix element form of the equation of
continuity, namely

$$i k_{mn}\,\rho_{\mu,m\alpha,n\beta}(\mathbf{r}') + \nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}') = 0 \tag{50}$$

This is formally identical with equation (31), but the current densities are
defined differently in the two cases. If we multiply equation (50) by an arbitrary
function $f(\mathbf{r}')$, integrate over $\mathbf{r}'$, and use equation (49), we
obtain the equivalent relation

$$(E_m - E_n)\langle\Psi_{m\alpha}|f_\mu|\Psi_{n\beta}\rangle - c\,\langle\Psi_{m\alpha}|[\boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu, f_\mu]|\Psi_{n\beta}\rangle = 0 \tag{51}$$

If we take $f_\mu = 1$, equation (51) reconfirms part of the orthogonality relations,
equation (9), namely for $E_m \ne E_n$. A more interesting result is obtained by
taking $f_\mu \to \mathbf{r}_\mu$; we find, using the commutation relation (3)

$$i c k_{mn}\,\mathbf{r}_{\mu,m\alpha,n\beta} = c\,\boldsymbol{\alpha}_{\mu,m\alpha,n\beta} \tag{52}$$

Hence $\dot{\mathbf{r}}_\mu$ also satisfies the general definition of time derivatives,
equations (16), as of course it should.

We now proceed to the main task of this section, namely to write down a
valid quantum mechanical expression for the energies $E_m$. Again, the
corresponding operator, $H$, is not available; nevertheless it is possible to
obtain a valid expression for $E_m$. We partition $E_m$ according to

$$E_m = E_{D,m} + E_{I,m}$$

where $E_{D,m}$ is the many-particle generalization of the Dirac energy, and
$E_{I,m}$ is the particle-particle interaction energy.
The Dirac energy is given by

$$E_{D,m} = \langle\Psi_{m\alpha}|H_D|\Psi_{m\alpha}\rangle$$

where

$$H_D = \sum_\mu\left(\beta_\mu m_\mu c^2 + \boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu c\right)$$

is simply the many-particle sum of the individual particle Dirac Hamiltonians,
without the potential energy. The latter has been omitted since we consider our
system a closed system of electromagnetic particles; any additional energy is to
be accounted for in the interaction energy. Obviously if $E_{D,m}$ were the only
contribution to the energy (we could even add an external potential energy) we
would have a Hamiltonian, namely $H_D$. We shall see shortly that the
interaction energy $E_{I,m}$ cannot be written as the diagonal matrix element of an
operator.
To formulate the electromagnetic interaction energy of the N-particle 
system we follow Wheeler and Feynman (1945, 1949). In this view direct 
particle-particle interaction is paramount, and the electromagnetic field plays a 
subordinate role. The interaction energy is then quantized, that is, reformu- 
lated in terms of matrix elements over charge and current densities; the 
electromagnetic field is never quantized independently. The deviations from 
conventional quantum electrodynamics are important; they may be sum- 
marized as follows. 

(1). There is no such concept as 'the' electromagnetic field with degrees of 
freedom of its own. Instead, there is a collection of adjunct fields, each 
produced by an individual particle, and completely determined by the 
motion of that particle. 

(2). The prevailing field acting on a given particle is determined by the sum 
of the fields produced by every particle other than the given particle. The 
interaction of a particle with its own field does not occur. 

(3). The fields produced by the particles are always taken half-retarded and
half-advanced. We note that this is the necessary and sufficient condition
that energy and momentum remain conserved (and therefore defined) within
a finite, but perhaps very large, volume. In classical electrodynamics, the
half-retarded and half-advanced solution describes a system of particles
that neither emits nor absorbs radiation.

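In the light of these remarks, we now write down the classical electromagnetic
interaction energy

$$E_I = \tfrac{1}{2}\sum_\mu\sum_{\nu\ne\mu}\int d\mathbf{r}'\,\left[\rho_\mu(\mathbf{r}',t)\,\phi_\nu(\mathbf{r}',t) - \mathbf{j}_\mu(\mathbf{r}',t)\cdot\mathbf{A}_\nu(\mathbf{r}',t)\right] \tag{56}$$

where in general $\rho_\mu(\mathbf{r}',t)$, $\mathbf{j}_\mu(\mathbf{r}',t)$,
$\phi_\mu(\mathbf{r}',t)$, $\mathbf{A}_\mu(\mathbf{r}',t)$ are the classical
time-dependent charge and current densities, and scalar and vector potentials
associated with the $\mu$th particle. The latter are given by

$$\begin{aligned}
\phi_\mu(\mathbf{r}',t) &= \tfrac{1}{2}\int d\mathbf{r}''\,R^{-1}\left[\rho_\mu(\mathbf{r}'',t-R/c) + \rho_\mu(\mathbf{r}'',t+R/c)\right] \\
\mathbf{A}_\mu(\mathbf{r}',t) &= \tfrac{1}{2}\int d\mathbf{r}''\,R^{-1}\left[\mathbf{j}_\mu(\mathbf{r}'',t-R/c) + \mathbf{j}_\mu(\mathbf{r}'',t+R/c)\right]
\end{aligned} \tag{57}$$

where we used the abbreviation

$$R = |\mathbf{r}' - \mathbf{r}''| \tag{58}$$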

To properly quantize this scheme we must introduce the time-dependent
matrix elements for the densities and potentials, as explained above, equations
(11) and (12). When that is done, the quantized equations (57) can be reduced
to relations between time-independent matrix elements by dividing by the
exponential time factor; the result is

$$\begin{aligned}
\phi_{\mu,m\alpha,n\beta}(\mathbf{r}') &= \int d\mathbf{r}''\,R^{-1}\cos(k_{mn}R)\,\rho_{\mu,m\alpha,n\beta}(\mathbf{r}'') \\
\mathbf{A}_{\mu,m\alpha,n\beta}(\mathbf{r}') &= \int d\mathbf{r}''\,R^{-1}\cos(k_{mn}R)\,\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}'')
\end{aligned} \tag{59}$$


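We now quantize equation (56) by introducing time-dependent matrix elements
in the right-hand side, and observing the rule for product formation,
equations (15). When that is done, the exponential time factors cancel, and we
obtain for the quantum mechanical interaction energy

$$E_{I,m} = \tfrac{1}{2}\sum_\mu\sum_{\nu\ne\mu}\sum_{n\beta}\int d\mathbf{r}'\,\left[\rho_{\mu,m\alpha,n\beta}(\mathbf{r}')\,\phi_{\nu,n\beta,m\alpha}(\mathbf{r}') - \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}')\cdot\mathbf{A}_{\nu,n\beta,m\alpha}(\mathbf{r}')\right] \tag{60}$$

We can eliminate the potentials using equations (59) to obtain

$$\begin{aligned}
E_{I,m} = \tfrac{1}{2}\sum_\mu\sum_{\nu\ne\mu}\sum_{n\beta}\int d\mathbf{r}'\int d\mathbf{r}''\,&R^{-1}\cos(k_{mn}R) \\
&\times\left[\rho_{\mu,m\alpha,n\beta}(\mathbf{r}')\,\rho_{\nu,n\beta,m\alpha}(\mathbf{r}'') - \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}')\cdot\mathbf{j}_{\nu,n\beta,m\alpha}(\mathbf{r}'')\right]
\end{aligned} \tag{61}$$

In this expression, the symmetry of the interaction between particles is
evident.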

We now combine the Dirac and interaction energies to obtain the total
energy. Since we will soon need to consider the energies as functionals of trial
wave functions, we display the wave functions and energies explicitly wherever
they occur. We also need to generalize the formula in such a way that
invariance with respect to unitary transformations of degenerate wave
functions becomes guaranteed and transparent. The resulting formula is

$$\begin{aligned}
E_m\delta_{\alpha\beta} ={}& \langle\Psi_{m\alpha}|H_D|\Psi_{m\beta}\rangle \\
&+ \tfrac{1}{2}\sum_\mu\sum_{\nu\ne\mu}\sum_{n\gamma}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}\cos[(E_m-E_n)R/\hbar c] \\
&\quad\times\big[\langle\Psi_{m\alpha}|\rho_\mu(\mathbf{r}')|\Psi_{n\gamma}\rangle\langle\Psi_{n\gamma}|\rho_\nu(\mathbf{r}'')|\Psi_{m\beta}\rangle \\
&\qquad - \langle\Psi_{m\alpha}|\mathbf{j}_\mu(\mathbf{r}')|\Psi_{n\gamma}\rangle\cdot\langle\Psi_{n\gamma}|\mathbf{j}_\nu(\mathbf{r}'')|\Psi_{m\beta}\rangle\big]
\end{aligned} \tag{62}$$

If the factor $\cos[(E_m-E_n)R/\hbar c]$ were absent in equation (62), we could
carry out the summation over $n\gamma$ using equation (10); $E_m$ would then reduce
to the diagonal matrix element of an operator. Hence it is the
retardation-advancement effect of the electromagnetic interaction, expressed in
the factor $\cos[(E_m-E_n)R/\hbar c]$, which prevents the definition of a
Hamiltonian for our relativistic many-particle system.

In a Hamiltonian case in general, if the energy expression is considered as a 
functional of a trial wave function, we can apply the variation principle to this 
expression: we demand that the energy be stationary to first order for any 
change in the wave function which preserves normalization. As is well known, 
this leads to the Schrodinger (or Dirac) equation. In the present case we do 
have an energy expression, but even formally there are important differences 
with the Hamiltonian case. 

In the Hamiltonian case the energy $E_m$ is expressed, explicitly and bilinearly,
in terms of any one of the wave functions $\Psi_{m\alpha}$ belonging to the level $m$.
Our equation (62) on the other hand is an infinite set of implicit transcendental
equations, each equation containing all $E_m$ and all $\Psi_{n\beta}$. We consider that
equation (62) nevertheless defines the $E_m$ as functionals; we must however
expect that each $E_m$ is a functional of all $\Psi_{n\beta}$.

We now adopt the variation principle in the following form. We demand that
all $E_m$ are stationary simultaneously to first order for any change in the set of
trial wave functions $\Psi_{n\beta}$ which is constrained by orthonormality, equation
(9), and completeness, equation (10). We furthermore expect that charge
conservation, equation (51), holds for the solution of this variational problem.

A direct attempt to derive practical equations for the wave functions from 
this principle, as a replacement for the Schrodinger-Dirac equation, does not 
appear to be a simple matter. We shall see, however, that it leads to a 
straightforward and orderly procedure within the Pauli perturbation scheme. 


The Pauli approximation is based on the assumption that the wave functions
and energies can be expanded using $c^{-1}$ as the expansion parameter. Writing

$$\lambda = c^{-1} \tag{63}$$

we proceed from the assumption that the Hamiltonian, energies and wave
functions are analytical functions of $\lambda$. For the sake of clarity, we shall in the
following display the dependence on $\lambda$ explicitly whenever appropriate; hence
we shall write $H_D(\lambda)$, $E_m(\lambda)$, $\Psi_{m\alpha}(\lambda)$.

The Dirac Hamiltonian, and consequently the energies, are obviously of
order $\lambda^{-2}$. It is convenient to introduce the scaled Hamiltonian and energies
defined by

$$\eta(\lambda) = \lambda^2 H_D(\lambda) = M + P\lambda + V\lambda^2 \tag{64}$$

$$\varepsilon_m(\lambda) = \lambda^2 E_m(\lambda) \tag{65}$$

The scaled Dirac equation is

$$\eta(\lambda)\Psi_{m\alpha}(\lambda) = \varepsilon_m(\lambda)\Psi_{m\alpha}(\lambda) \tag{66}$$

In equation (64) we introduced the operators

$$M = \beta m \tag{67}$$

$$P = \boldsymbol{\alpha}\cdot\mathbf{p} \tag{68}$$

$M$ is the rest mass operator, and $P$ may be considered the momentum
magnitude, since the commutation properties of the Dirac matrices assure that

$$P^2 = \mathbf{p}\cdot\mathbf{p} \tag{69}$$
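
The algebra behind equation (69) can be illustrated with explicit matrices. The sketch below assumes NumPy and the standard (Dirac) representation of the matrices; the momentum is replaced by an arbitrary numerical vector, as for a plane wave, and the values of `m`, `p` and `lam` are illustrative assumptions. It sets $V = 0$ and also checks that the scaled Hamiltonian then has eigenvalues $\pm(m^2 + \lambda^2\,\mathbf{p}\cdot\mathbf{p})^{1/2}$, which follows because $M$ and $P$ anticommute.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Dirac matrices in the standard representation (a choice made for this sketch).
alpha = [np.block([[Z2, s], [s, Z2]]) for s in pauli]
beta = np.block([[I2, Z2], [Z2, -I2]])

m = 1.0                             # illustrative mass
p = np.array([0.3, -1.2, 0.7])      # illustrative c-number momentum (plane wave)
lam = 0.1                           # illustrative value of the expansion parameter

M = m * beta                                   # rest mass operator
P = sum(pk * ak for pk, ak in zip(p, alpha))   # alpha . p

P_squared_ok = np.allclose(P @ P, np.dot(p, p) * np.eye(4))
M_P_anticommute = np.allclose(M @ P + P @ M, 0)

# Scaled Hamiltonian with V = 0; its square is (m^2 + lam^2 p.p) times the identity.
eta = M + lam * P
eig = np.linalg.eigvalsh(eta)
eigenvalues_ok = np.allclose(np.abs(eig), np.sqrt(m**2 + lam**2 * np.dot(p, p)))
```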

It is useful to establish first some properties of the solutions of the scaled
Dirac equation (66) for $\lambda = 0$. In preparation for the many-particle case, we
shall make a distinction between the operators $M$ and $\beta$, although they are in
this case proportional, see equation (67). We have

$$M\Psi_{m\alpha}(0) = \varepsilon_m(0)\Psi_{m\alpha}(0), \qquad \beta\Psi_{m\alpha}(0) = \tau_m\Psi_{m\alpha}(0) \tag{70}$$

Since $\beta$ is unitary Hermitian, the eigenvalues are given by

$$\varepsilon_m(0) = \tau_m m, \qquad \tau_m = \pm 1 \tag{71}$$

We say that $\Psi_{m\alpha}(0)$ has rest mass $\pm m$ for $\tau_m = \pm 1$. At this point the
wave functions $\Psi_{m\alpha}(0)$ are still highly degenerate. In fact, all we can say so
far is that for positive/negative rest mass the lower/upper pair of spinor
components vanishes. This leaves completely undetermined the dependence on the
space coordinate $\mathbf{r}$, and on the spin coordinate, the latter being the discrete
index labelling the non-vanishing spinor components.
Because of the commutation properties of the Dirac matrices we have

$$\beta\,\eta(\lambda)\,\beta = \eta(-\lambda) \tag{72}$$

and therefore, from equation (66)

$$\eta(-\lambda)\,\beta\Psi_{m\alpha}(\lambda) = \varepsilon_m(\lambda)\,\beta\Psi_{m\alpha}(\lambda) \tag{73}$$

Clearly, the operator $\beta$ maps the eigenfunctions of the level $m$ of $\eta(\lambda)$ into
those of some level, say $n$, of $\eta(-\lambda)$, such that $\varepsilon_m(\lambda) = \varepsilon_n(-\lambda)$.
This mapping $m \to n$ remains valid for continuous changes in $\lambda$, and in
particular holds for $\lambda = 0$. But for $\lambda = 0$ the mapping by $\beta$ is stated in the
second equation (70), so that $m = n$. Hence we have proved that

$$\varepsilon_m(\lambda) = \varepsilon_m(-\lambda) \tag{74}$$

and that $\beta\Psi_{m\alpha}(\lambda)$ is a linear transformation of $\Psi_{m\beta}(-\lambda)$. In
Appendix A it is shown that the $\Psi_{m\alpha}(\lambda)$ may be chosen so that the
transformation induced by $\beta$ takes on the simple canonical form

$$\beta\Psi_{m\alpha}(\lambda) = \tau_m\Psi_{m\alpha}(-\lambda) \tag{75}$$

We assume in the following that equation (75) always holds.

We now put forward the Pauli perturbation expansion for the wave functions
and energies, namely

$$\Psi_{m\alpha}(\lambda) = \sum_{p=0}^{\infty}\Psi_{m\alpha,p}\,\lambda^{p}, \qquad \varepsilon_m(\lambda) = \sum_{p=0}^{\infty}\varepsilon_{m,p}\,\lambda^{2p} \tag{76}$$

where we have limited the energy expansion to even powers of $\lambda$, because of
equation (74). From equation (75) follows the important relation

$$\beta\Psi_{m\alpha,p} = \tau_m(-1)^p\,\Psi_{m\alpha,p} \tag{77}$$

The fact that the leading term in the scaled energy expansion is always finite
points to a limitation of the Pauli perturbation expansion. Namely for a finite
value of $\lambda$ there are stationary states of arbitrarily high scaled energy;
connecting these to a finite scaled energy for $\lambda = 0$ cannot be achieved by a
uniformly convergent process. Hence the Pauli perturbation expansion is at best a
semiconvergent process, and is practical perhaps only for states with energies
close to the rest mass energies.

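We now proceed to determine the wave functions and energies in more
detail. Substitution of the expansions (76) into the scaled Dirac equation (66)
should give us the necessary and sufficient equations to determine the wave
functions and energies term by term for each power of $\lambda$. An immediate
simplification is obtained from the relation

$$(M - \varepsilon_{m,0})\Psi_{m\alpha,p} = \begin{cases} 0, & p \text{ even} \\ 2M\Psi_{m\alpha,p}, & p \text{ odd} \end{cases} \tag{78}$$

which is easily proved using equations (67), (70) and (77). The equation arising
from $\lambda^0$ is identically satisfied; the next four are

$$\begin{aligned}
2M\Psi_{m\alpha,1} + P\Psi_{m\alpha,0} &= 0 \\
P\Psi_{m\alpha,1} + (V - \varepsilon_{m,1})\Psi_{m\alpha,0} &= 0 \\
2M\Psi_{m\alpha,3} + P\Psi_{m\alpha,2} + (V - \varepsilon_{m,1})\Psi_{m\alpha,1} &= 0 \\
P\Psi_{m\alpha,3} + (V - \varepsilon_{m,1})\Psi_{m\alpha,2} - \varepsilon_{m,2}\Psi_{m\alpha,0} &= 0
\end{aligned} \tag{79}$$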


The first equation (79) is solved by

$$\Psi_{m\alpha,1} = K\Psi_{m\alpha,0} \tag{80}$$

where $K$ is the anti-Hermitian operator defined by

$$K = -(2m)^{-1}\beta\,\boldsymbol{\alpha}\cdot\mathbf{p} \tag{81}$$

It is interesting to note that equation (80) permits $\Psi_{m\alpha,1}$ to be calculated
directly from $\Psi_{m\alpha,0}$, whereas the latter is still relatively undetermined as a
solution of a highly degenerate eigenvalue problem.

Substituting $\Psi_{m\alpha,1}$ from equation (80) into the second equation (79) we
obtain

$$(T + V - \varepsilon_{m,1})\Psi_{m\alpha,0} = 0 \tag{82}$$

where $T$ is the Hermitian operator defined by

$$T = (2m)^{-1}\beta\,\mathbf{p}\cdot\mathbf{p} \tag{83}$$

For wave functions of positive rest mass the operator $\beta$ acts like the identity,
and equation (82) becomes equivalent to the non-relativistic Schrödinger
equation for a particle with spin one half, moving in a field represented by a
spin-independent Hamiltonian.

We now proceed to calculate the lowest order relativistic correction to the
energy, $\varepsilon_{m,2}$. Taking equation (80) and the last two equations (79) we can
eliminate $\Psi_{m\alpha,1}$ and $\Psi_{m\alpha,3}$; the result is

$$(T + V - \varepsilon_{m,1})\Psi_{m\alpha,2} - [K(V - \varepsilon_{m,1})K + \varepsilon_{m,2}]\Psi_{m\alpha,0} = 0 \tag{84}$$

Taking the scalar product with $\Psi_{m\alpha,0}$ and using equation (82) we find for the
energy correction

$$\varepsilon_{m,2} = \langle\Psi_{m\alpha,0}|K(\varepsilon_{m,1} - V)K|\Psi_{m\alpha,0}\rangle \tag{85}$$

The expression (85) can be transformed to yield terms which can be
interpreted as representing specific physical effects. We eliminate $\varepsilon_{m,1}$ using
equation (82), applying half of the operator $T + V$ to the right and the other
half to the left; we obtain

$$\varepsilon_{m,2} = \langle\Psi_{m\alpha,0}|T' + V_{LS}|\Psi_{m\alpha,0}\rangle \tag{86}$$

where

$$T' = \tfrac{1}{2}(K^2 T + T K^2) \tag{87}$$

$$V_{LS} = \tfrac{1}{2}[K, [K, V]] \tag{88}$$

The operator $T'$ represents the relativistic mass correction. Using equations
(81) and (83) one easily finds

$$T' = -(2m)^{-3}\beta\,(\mathbf{p}\cdot\mathbf{p})^2 \tag{89}$$

It is to be noted that $T'$ as given by equation (89) is not properly Hermitian if
the potential energy $V$ is due in part to point charges. In that case the wave
function $\Psi_{m\alpha,0}$ may have a mild discontinuity (cusp) at the site of a charge.
Hermiticity of an operator involving $\mathbf{p}$ depends on partial integration, and the
vanishing of the surface integrals occurring in that process. It turns out that for
the wave function $\Psi_{m\alpha,0}$ the operator $\mathbf{p}\cdot\mathbf{p}$ causes no problem in
this respect, but $(\mathbf{p}\cdot\mathbf{p})^2$ does. In a sense this difficulty was created
artificially when we replaced $\varepsilon_{m,1}$ in equation (85) by $T + V$; $T$ and $V$
introduce compensating singular behaviour when operating on $\Psi_{m\alpha,0}$. In
practice, a simple remedy for the non-Hermiticity of $T'$ consists in applying one
factor $\mathbf{p}\cdot\mathbf{p}$ to the right and the other one to the left.
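
For a free particle ($V = 0$) of positive rest mass the scaled energy is known in closed form, $\varepsilon(\lambda) = (m^2 + p^2\lambda^2)^{1/2}$, so the first coefficients of the expansion can be compared directly with the kinetic energy $p^2/2m$ and the mass correction $-p^4/8m^3$. The following minimal sketch does this numerically; the function names and the values of `m`, `p` and `lam` are illustrative assumptions.

```python
import math

def scaled_energy(m, p, lam):
    """Exact scaled energy lam^2 * E of a free Dirac particle (positive rest mass)."""
    return math.sqrt(m * m + (p * lam) ** 2)

def pauli_series(m, p, lam):
    """First three terms of the expansion in even powers of lam."""
    eps0 = m                        # rest mass
    eps1 = p**2 / (2 * m)           # non-relativistic kinetic energy
    eps2 = -p**4 / (8 * m**3)       # relativistic mass correction
    return eps0 + eps1 * lam**2 + eps2 * lam**4

m, p, lam = 1.0, 0.5, 0.1           # illustrative values, arbitrary units
exact = scaled_energy(m, p, lam)
approx = pauli_series(m, p, lam)
error = abs(exact - approx)         # dominated by the neglected lam^6 term
```

For these values the truncation error is of the size of the next term, $p^6\lambda^6/16m^5$; for $p\lambda$ comparable to $m$ the truncated series is no longer a useful approximation.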

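For the further interpretation of $V_{LS}$ we introduce the four-component
generalization of the two-component Pauli spin vector

$$\boldsymbol{\sigma} = -\tfrac{1}{2}i\,\boldsymbol{\alpha}\times\boldsymbol{\alpha} \tag{90}$$

which satisfies

$$\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma} \tag{91}$$

The angular momentum operator which represents the spin of the particle is

$$\mathbf{s} = \tfrac{1}{2}\hbar\boldsymbol{\sigma} \tag{92}$$

If $\mathbf{a}$ and $\mathbf{b}$ are any Dirac-diagonal vector operators, we have the
useful relation

$$(\boldsymbol{\alpha}\cdot\mathbf{a})(\boldsymbol{\alpha}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b}) \tag{93}$$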
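
The product relation for Dirac-diagonal vectors, and the expression of the spin vector through the Dirac matrices, can be verified with explicit matrices. The sketch below assumes NumPy and the standard representation; the vectors `a` and `b` are ordinary numerical vectors chosen for illustration, so operator-ordering questions that arise for genuine operators are not captured here.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Standard representation (assumed): alpha is off-diagonal, sigma is block-diagonal.
alpha = [np.block([[Z2, s], [s, Z2]]) for s in pauli]
sigma = [np.block([[s, Z2], [Z2, s]]) for s in pauli]

def dot(vec, mats):
    """Contract a numerical 3-vector with a triple of matrices."""
    return sum(v * m for v, m in zip(vec, mats))

a = np.array([0.2, -0.5, 1.1])      # illustrative c-number vectors
b = np.array([1.3, 0.4, -0.9])

lhs = dot(a, alpha) @ dot(b, alpha)
rhs = np.dot(a, b) * np.eye(4) + 1j * dot(np.cross(a, b), sigma)
identity_holds = np.allclose(lhs, rhs)

# z-component of sigma = -(i/2) alpha x alpha
cross_z = alpha[0] @ alpha[1] - alpha[1] @ alpha[0]
sigma_from_alpha = np.allclose(sigma[2], -0.5j * cross_z)
```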

Equations (90), (92) and (93) are used to derive our final expression for $V_{LS}$:

$$V_{LS} = \tfrac{1}{2}m^{-2}\left[(\nabla V\times\mathbf{p})\cdot\mathbf{s} + \tfrac{1}{4}\hbar^2\,\Delta V\right] \tag{94}$$

The first term in equation (94) is the usual spin-orbit coupling; the second term,
called the Darwin term, is often said to represent the Zitterbewegung. When $V$
is due to point charges, $\Delta V$ vanishes except at the charge sites. Careful analysis
shows that in this case $\Delta V$ yields a delta-function. Since the two terms of
equation (94) really belong together, we shall in the following call $V_{LS}$ the
spin-orbit interaction operator.

Finally we note that the scheme presented so far does not make allowance for
an external magnetic field, and the interaction of the spin with it: we introduced
a potential $V$ which can be specialized to Coulomb potentials due to external
sources, but we did not introduce a corresponding vector potential. The reason
for our omission is that it is not straightforward to introduce a vector potential
from an external source without bringing about inconsistencies; in particular,
the mapping $\Psi_{m\alpha}(\lambda) \leftrightarrow \Psi_{m\alpha}(-\lambda)$ with the operator $\beta$ is no
longer valid. Since the primary purpose of this paper is to treat interactions, we
shall not dwell upon this any further. Actually, in the next section we will treat
those magnetic and spin effects which arise from the interactions between
electromagnetic particles, all of which are part of our quantum mechanical system.


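We again consider the wave functions to depend parametrically on the variable
$\lambda = c^{-1}$, and adopt the notation $E_m(\lambda)$, $\Psi_{m\alpha}(\lambda)$. We generalize the
operators $M$, $P$, $K$, $T$, defined before for a single particle in equations (67),
(68), (81) and (83), for the many-particle case:

$$\begin{aligned}
M &= \sum_\mu M_\mu = \sum_\mu \beta_\mu m_\mu \\
P &= \sum_\mu P_\mu = \sum_\mu \boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu \\
K &= \sum_\mu K_\mu = -\sum_\mu (2m_\mu)^{-1}\beta_\mu\,\boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu \\
T &= \sum_\mu T_\mu = \sum_\mu (2m_\mu)^{-1}\beta_\mu\,\mathbf{p}_\mu\cdot\mathbf{p}_\mu
\end{aligned} \tag{95}$$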


Note that $M$, $P$, $T$ are Hermitian, and $K$ is anti-Hermitian. The following
commutation and anti-commutation relations are easily verified:

$$[M_\mu, T_\mu] = 0, \qquad \{M_\mu, P_\mu\} = \{M_\mu, K_\mu\} = \{P_\mu, K_\mu\} = \{P_\mu, T_\mu\} = \{K_\mu, T_\mu\} = 0 \tag{96}$$
The operators $P$ and $T$ can be written as commutators, namely

$$P = [K, M] \tag{97}$$

$$T = -\tfrac{1}{2}[K, P] = -\tfrac{1}{2}[K, [K, M]] \tag{98}$$

Equations (97) and (98) are specific examples of a particularly useful device:
namely if we take the commutator of two operators, each of which is the sum of
one-particle operators, the result is again a sum of one-particle operators. Note
that there is no analogue of this for an anti-commutator.
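
The statement about commutators of sums of one-particle operators, and the failure of the corresponding statement for anti-commutators, can be illustrated with small matrices. The sketch below assumes NumPy and uses random $2 \times 2$ single-particle matrices for three particles as a toy stand-in for the $4 \times 4$ Dirac case; the helper names are illustrative.

```python
import numpy as np
from functools import reduce

def embed(a, mu, n, dim=2):
    """Place the dim x dim matrix `a` in slot mu (1-based) of an n-fold direct product."""
    factors = [np.eye(dim, dtype=complex) for _ in range(n)]
    factors[mu - 1] = a
    return reduce(np.kron, factors)

def random_matrix(rng, dim=2):
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

rng = np.random.default_rng(1)
n = 3                                           # toy particle number
a = [random_matrix(rng) for _ in range(n)]      # one-particle operators a_mu
b = [random_matrix(rng) for _ in range(n)]      # one-particle operators b_mu

A = sum(embed(a[mu], mu + 1, n) for mu in range(n))
B = sum(embed(b[mu], mu + 1, n) for mu in range(n))

# Sums of the one-particle commutators and anti-commutators, embedded slot by slot.
one_particle_comm = sum(embed(a[mu] @ b[mu] - b[mu] @ a[mu], mu + 1, n) for mu in range(n))
one_particle_anti = sum(embed(a[mu] @ b[mu] + b[mu] @ a[mu], mu + 1, n) for mu in range(n))

commutator_is_one_particle_sum = np.allclose(A @ B - B @ A, one_particle_comm)
anticommutator_is_one_particle_sum = np.allclose(A @ B + B @ A, one_particle_anti)
```

The cross terms between different particles cancel in the commutator but add up in the anti-commutator, which is why only the first comparison succeeds.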

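The explicit formula for the scaled total energy, with $\lambda$ displayed as a
variable, is readily obtained from equation (62); we write

$$\begin{aligned}
\varepsilon_m(\lambda)\delta_{\alpha\beta} ={}& \langle\Psi_{m\alpha}(\lambda)|M + \lambda P|\Psi_{m\beta}(\lambda)\rangle \\
&+ \tfrac{1}{2}\lambda^2\sum_\mu\sum_{\nu\ne\mu}\sum_{n\gamma}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}\cos\{[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]R/\hbar\lambda\} \\
&\quad\times\big[\langle\Psi_{m\alpha}(\lambda)|\rho_\mu(\mathbf{r}')|\Psi_{n\gamma}(\lambda)\rangle\langle\Psi_{n\gamma}(\lambda)|\rho_\nu(\mathbf{r}'')|\Psi_{m\beta}(\lambda)\rangle \\
&\qquad - \langle\Psi_{m\alpha}(\lambda)|\mathbf{j}_\mu(\mathbf{r}')|\Psi_{n\gamma}(\lambda)\rangle\cdot\langle\Psi_{n\gamma}(\lambda)|\mathbf{j}_\nu(\mathbf{r}'')|\Psi_{m\beta}(\lambda)\rangle\big]
\end{aligned} \tag{99}$$

Similarly we rewrite the various conditions on the wave functions, with $\lambda$
explicitly displayed. We obtain for the orthonormality and completeness

$$\langle\Psi_{m\alpha}(\lambda)|\Psi_{n\beta}(\lambda)\rangle = \delta_{m\alpha,n\beta}$$

$$\sum_{m\alpha}|\Psi_{m\alpha}(\lambda)\rangle\langle\Psi_{m\alpha}(\lambda)| = 1$$

Charge conservation is expressed by the two equivalent statements

$$[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]\,\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) - i\lambda\hbar\,\nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) = 0$$

$$[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]\langle\Psi_{m\alpha}(\lambda)|f_\mu|\Psi_{n\beta}(\lambda)\rangle - \lambda\langle\Psi_{m\alpha}(\lambda)|[P, f_\mu]|\Psi_{n\beta}(\lambda)\rangle = 0$$

where

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= \langle\Psi_{m\alpha}(\lambda)|\rho_\mu(\mathbf{r}')|\Psi_{n\beta}(\lambda)\rangle \\
\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= \langle\Psi_{m\alpha}(\lambda)|\mathbf{j}_\mu(\mathbf{r}')|\Psi_{n\beta}(\lambda)\rangle
\end{aligned}$$

We emphasize again that $f_\mu$ is a function of the position $\mathbf{r}_\mu$ only.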

Before proceeding further with equations (99) to (104) we introduce another
useful operator, namely

$$\beta = \beta_1\beta_2\cdots\beta_N \tag{105}$$

Obviously $\beta$ is unitary Hermitian. We shall see shortly that this operator is the
many-particle generalization of the one-particle operator $\beta$ for the purpose of
the mapping $\Psi(\lambda) \to \Psi(-\lambda)$. However, the many-particle $\beta$ has no direct
simple connection with the rest mass operator $M$; equation (67), or anything like
it, does not hold for the many-particle case. In connection with other operators,
the following commutation and anticommutation relations are useful:

$$[\beta, M] = 0, \qquad \{\beta, P\} = 0, \qquad [\beta, \rho_\mu(\mathbf{r}')] = 0, \qquad \{\beta, \mathbf{j}_\mu(\mathbf{r}')\} = 0 \tag{106}$$


As in the one-particle case, it is important to have in hand the wave functions
and scaled energies for $\lambda = 0$. Clearly, for this limiting case our system becomes
Hamiltonian, with the eigenvalue equation

$$M\Psi_{m\alpha}(0) = \varepsilon_m(0)\Psi_{m\alpha}(0) \tag{107}$$

Equation (107) is formally identical with equation (70) for the one-particle
case. Actually, equation (107) is separable into $N$ such one-particle problems.
We take for the wave functions $\Psi_{m\alpha}(0)$ the direct products of the one-particle
solutions; the eigenvalues are given by

$$\varepsilon_{m,0} = \sum_\mu \tau_{m\mu}\,m_\mu, \qquad \tau_{m\mu} = \pm 1 \tag{108}$$

To obtain all possible eigenvalues, one must in general take all possible
combinations of $+$ and $-$ signs for the different values of $\mu$; this corresponds to
the individual particle spinors having positive and negative rest masses,
respectively. We say that the spinors with $\tau_{m\mu} = -1$ constitute holes in the wave
function $\Psi_{m\alpha}(0)$. If any of the particles have equal masses, there is additional
degeneracy for $\varepsilon_m(0)$, but that need not concern us at this moment. For the
direct product wave functions we have

$$\beta\Psi_{m\alpha}(0) = \tau_m\Psi_{m\alpha}(0), \qquad \tau_m = (-1)^{\frac{1}{2}\sum_\mu(1-\tau_{m\mu})} \tag{109}$$

Note that according to equations (107) and (109) the $\Psi_{m\alpha}(0)$ are simultaneous
eigenfunctions of $M$ and $\beta$; this is of course possible because $M$ and $\beta$
commute, see equations (106). Evidently $\tau_m = 1$ or $\tau_m = -1$ holds for a wave
function with an even or odd number of holes, respectively. Accordingly we
call $\beta$ the hole parity operator. The wave function $\Psi_{m\alpha}(\lambda)$ does not possess
hole parity, but $\Psi_{m\alpha}(0)$ does possess it according to equations (109). If an
operator commutes/anticommutes with $\beta$ it will preserve/reverse hole parity.
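
Hole parity can be made concrete for two particles. The sketch below assumes NumPy, the standard representation of the Dirac matrices, and illustrative masses and numerical momenta; it checks that the product of the single-particle $\beta$ matrices commutes with the rest mass operator, anticommutes with the momentum operator, and takes the value $+1$ or $-1$ according to whether the number of negative rest masses is even or odd.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
alpha = [np.block([[Z2, s], [s, Z2]]) for s in pauli]
beta1 = np.diag([1, 1, -1, -1]).astype(complex)   # single-particle beta
I4 = np.eye(4, dtype=complex)

m1, m2 = 1.0, 3.0                    # illustrative masses
p1 = [0.3, -1.2, 0.7]                # illustrative c-number momenta
p2 = [0.5, 0.1, -0.4]

M = m1 * np.kron(beta1, I4) + m2 * np.kron(I4, beta1)   # rest mass operator
P = (np.kron(sum(c * a for c, a in zip(p1, alpha)), I4)
     + np.kron(I4, sum(c * a for c, a in zip(p2, alpha))))
beta = np.kron(beta1, beta1)                             # hole parity operator

beta_commutes_with_M = np.allclose(beta @ M - M @ beta, 0)
beta_anticommutes_with_P = np.allclose(beta @ P + P @ beta, 0)

# M and beta are both diagonal here: pair each rest-mass eigenvalue with its hole parity.
levels = sorted({(float(e.real), int(round(t.real)))
                 for e, t in zip(np.diag(M), np.diag(beta))})
```

With these masses the four rest-mass levels are $\pm 4$ (no holes or two holes, parity $+1$) and $\pm 2$ (one hole, parity $-1$).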

We now return to equations (99) to (104) for $\lambda \ne 0$. With the help of
equations (106) it is not difficult to see that equations (99) to (104) remain valid
if we replace each wave function symbol $\Psi$ by $\beta\Psi$, and change the sign of
$\lambda$ whenever $\lambda$ occurs as an argument of $\varepsilon$, $\rho$, $\mathbf{j}$ or $\Psi$ (not when
$\lambda$ appears algebraically). Suppressing indices for a moment, we can say
therefore that the $\varepsilon(-\lambda)$ are the same functions of the $\beta\Psi(-\lambda)$ as the
$\varepsilon(\lambda)$ are of the $\Psi(\lambda)$. Clearly the physical solutions for the two cases,
which occur when all $\varepsilon(\lambda)$ and $\varepsilon(-\lambda)$ are stationary, must be mappings of
each other. More precisely, there is a correspondence $m \leftrightarrow n$ so that
$\varepsilon_m(-\lambda) = \varepsilon_n(\lambda)$, and the $\beta\Psi_{m\alpha}(\lambda)$ are linear transformations of
the $\Psi_{n\beta}(-\lambda)$. Continuity for $\lambda = 0$ together with equations (109)
establishes $m = n$; and Appendix A again justifies the canonical form of the
transformation induced by $\beta$.

In summary, we have proved that for the many-particle case the dependence
of the energies and wave functions on the expansion parameter $\lambda$ has again the
special properties expressed by equations (74) and (75). We conclude, then,
that the Pauli perturbation expansion, as expressed by equations (76), also
holds for the many-particle case, as does equation (77). The latter equation
takes on new significance for the many-particle case; it expresses that the
expansion functions $\Psi_{m\alpha,p}$ have definite hole parity, alternating for even and
odd $p$.

For the charge and current densities the mapping by $\beta$ yields the simple
relations

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= \tau_m\tau_n\,\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',-\lambda) \\
\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= -\tau_m\tau_n\,\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',-\lambda)
\end{aligned} \tag{110}$$


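We now pursue in more detail the consequences of the perturbation expansion
(76) with respect to orthonormality, completeness, charge and current
densities, charge conservation and the energy, equations (99) to (104). For the
orthonormality and completeness conditions the results are simple, namely

$$\sum_{q=0}^{p}\langle\Psi_{m\alpha,p-q}|\Psi_{n\beta,q}\rangle = \delta_{m\alpha,n\beta}\,\delta_{p0} \tag{111}$$

$$\sum_{m\alpha}\sum_{q=0}^{p}|\Psi_{m\alpha,p-q}\rangle\langle\Psi_{m\alpha,q}| = \delta_{p0} \tag{112}$$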


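For the charge and current densities we obtain the expansions

$$\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) = \sum_{p=0}^{\infty}\rho_{\mu,m\alpha,n\beta,p}(\mathbf{r}')\,\lambda^p$$

$$\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) = \sum_{p=0}^{\infty}\mathbf{j}_{\mu,m\alpha,n\beta,p}(\mathbf{r}')\,\lambda^p$$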





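For the charge conservation condition we get different equations for the even
and odd powers of $\lambda$, namely

$$\begin{aligned}
\sum_{q=0}^{p}(\varepsilon_{m,q}-\varepsilon_{n,q})\,\rho_{\mu,m\alpha,n\beta,2p-2q}(\mathbf{r}') - i\hbar\,\nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta,2p-1}(\mathbf{r}') &= 0 \\
\sum_{q=0}^{p}(\varepsilon_{m,q}-\varepsilon_{n,q})\,\rho_{\mu,m\alpha,n\beta,2p-2q+1}(\mathbf{r}') - i\hbar\,\nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta,2p}(\mathbf{r}') &= 0
\end{aligned} \tag{115}$$

or the equivalent form

$$\begin{aligned}
\sum_{q=0}^{p}(\varepsilon_{m,q}-\varepsilon_{n,q})\sum_{r=0}^{2p-2q}\langle\Psi_{m\alpha,2p-2q-r}|f_\mu|\Psi_{n\beta,r}\rangle - \sum_{r=0}^{2p-1}\langle\Psi_{m\alpha,2p-r-1}|[P, f_\mu]|\Psi_{n\beta,r}\rangle &= 0 \\
\sum_{q=0}^{p}(\varepsilon_{m,q}-\varepsilon_{n,q})\sum_{r=0}^{2p-2q+1}\langle\Psi_{m\alpha,2p-2q-r+1}|f_\mu|\Psi_{n\beta,r}\rangle - \sum_{r=0}^{2p}\langle\Psi_{m\alpha,2p-r}|[P, f_\mu]|\Psi_{n\beta,r}\rangle &= 0
\end{aligned} \tag{116}$$

In the first equations (115) and (116) the current term is meaningless for $p = 0$;
the correct interpretation is to omit the offending term.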

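In order to apply the perturbation expansion to the energy formula (99) it is
convenient to introduce the functions

$$F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',\lambda) = \tfrac{1}{2}\sum_\mu\sum_{\nu\ne\mu}\sum_\gamma\left[\rho_{\mu,m\alpha,n\gamma}(\mathbf{r}',\lambda)\,\rho_{\nu,n\gamma,m\beta}(\mathbf{r}'',\lambda) - \mathbf{j}_{\mu,m\alpha,n\gamma}(\mathbf{r}',\lambda)\cdot\mathbf{j}_{\nu,n\gamma,m\beta}(\mathbf{r}'',\lambda)\right] \tag{117}$$

Note that, because of equations (110)

$$F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',\lambda) = F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',-\lambda) \tag{118}$$

Using equation (117) we can rewrite the energy formula (99) in the simpler
form

$$\begin{aligned}
\varepsilon_m(\lambda)\delta_{\alpha\beta} ={}& \langle\Psi_{m\alpha}(\lambda)|M + \lambda P|\Psi_{m\beta}(\lambda)\rangle \\
&+ \lambda^2\sum_n\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}\cos\{[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]R/\hbar\lambda\}\,F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',\lambda)
\end{aligned} \tag{119}$$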


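We now apply the perturbation expansion (76) to equation (119). The Dirac
term can be handled like the orthonormality condition, and poses no new
problem. The interaction term is also straightforward with respect to the
function $F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',\lambda)$, for which we have the expansion

$$F_{nm\alpha\beta}(\mathbf{r}',\mathbf{r}'',\lambda) = \sum_{p=0}^{\infty}F_{nm\alpha\beta,p}(\mathbf{r}',\mathbf{r}'')\,\lambda^{2p} \tag{120}$$

where

$$F_{nm\alpha\beta,p}(\mathbf{r}',\mathbf{r}'') = \tfrac{1}{2}\sum_{q=0}^{2p}\sum_\mu\sum_{\nu\ne\mu}\sum_\gamma\left[\rho_{\mu,m\alpha,n\gamma,2p-q}(\mathbf{r}')\,\rho_{\nu,n\gamma,m\beta,q}(\mathbf{r}'') - \mathbf{j}_{\mu,m\alpha,n\gamma,2p-q}(\mathbf{r}')\cdot\mathbf{j}_{\nu,n\gamma,m\beta,q}(\mathbf{r}'')\right] \tag{121}$$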

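The cosine factor in equation (119) poses a new problem, because of the
manner in which $\lambda$ appears in the argument. We find for this argument the
expansion

$$[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]R/\hbar\lambda = R\hbar^{-1}\sum_{p=0}^{\infty}(\varepsilon_{m,p}-\varepsilon_{n,p})\,\lambda^{2p-1} \tag{122}$$

The first term in this expansion is of order $\lambda$ if $\varepsilon_{m,0} = \varepsilon_{n,0}$, and of order
$\lambda^{-1}$ if $\varepsilon_{m,0} \ne \varepsilon_{n,0}$. The cosine will have to be treated quite differently
for these two cases, and the summation over $n$ in equation (119) has to be split
up accordingly. For this purpose it is useful to define the sets $\mathcal{S}(m)$ and
$\mathcal{C}(m)$ by means of

$$n \in \mathcal{S}(m) \ \text{if} \ \varepsilon_{n,0} = \varepsilon_{m,0}, \qquad n \in \mathcal{C}(m) \ \text{if} \ \varepsilon_{n,0} \ne \varepsilon_{m,0} \tag{123}$$

so that

$$\sum_n = \sum_{n\in\mathcal{S}(m)} + \sum_{n\in\mathcal{C}(m)} \tag{124}$$

If the argument of the cosine is of order $\lambda$, we can use the power series
expansion of the cosine. Hence we have up to order $\lambda^4$

$$\cos\{[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]R/\hbar\lambda\} = 1 - \tfrac{1}{2}\hbar^{-2}(\varepsilon_{m,1}-\varepsilon_{n,1})^2R^2\lambda^2 + O(\lambda^4), \qquad n \in \mathcal{S}(m) \tag{125}$$

On the other hand if the argument of the cosine is of order $\lambda^{-1}$, the power series
expansion of the cosine is useless. We can however develop the integral in a
power series of $\lambda$. We start from an asymptotic expansion which is proved
in Appendix B, namely the expansion of the operator $R^{-1}\cos(kR)$ for
large $k$:

$$R^{-1}\cos(kR) = -4\pi\,\delta(\mathbf{r}'-\mathbf{r}'')\sum_{p=0}^{\infty}k^{-2p-2}(\nabla'\cdot\nabla'')^p \tag{126}$$

We have to apply this formula, where $k$ is itself an odd power series in $\lambda$ starting
with $\lambda^{-1}$. The result is, to order $\lambda^4$

$$R^{-1}\cos\{[\varepsilon_m(\lambda)-\varepsilon_n(\lambda)]R/\hbar\lambda\} = -4\pi\hbar^2(\varepsilon_{m,0}-\varepsilon_{n,0})^{-2}\,\delta(\mathbf{r}'-\mathbf{r}'')\,\lambda^2 + O(\lambda^4), \qquad n \in \mathcal{C}(m) \tag{127}$$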


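We can now put together the energy expansion up to order $\lambda^4$. Because of
hole parity, we need to consider only even powers; we obtain

$$\begin{aligned}
\varepsilon_{m,0}\delta_{\alpha\beta} ={}& \langle\Psi_{m\alpha,0}|M|\Psi_{m\beta,0}\rangle \\
\varepsilon_{m,1}\delta_{\alpha\beta} ={}& \sum_{q=0}^{2}\langle\Psi_{m\alpha,2-q}|M|\Psi_{m\beta,q}\rangle + \sum_{q=0}^{1}\langle\Psi_{m\alpha,1-q}|P|\Psi_{m\beta,q}\rangle \\
&+ \sum_{n\in\mathcal{S}(m)}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'') \\
\varepsilon_{m,2}\delta_{\alpha\beta} ={}& \sum_{q=0}^{4}\langle\Psi_{m\alpha,4-q}|M|\Psi_{m\beta,q}\rangle + \sum_{q=0}^{3}\langle\Psi_{m\alpha,3-q}|P|\Psi_{m\beta,q}\rangle \\
&+ \sum_{n\in\mathcal{S}(m)}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}F_{nm\alpha\beta,1}(\mathbf{r}',\mathbf{r}'') \\
&- \tfrac{1}{2}\hbar^{-2}\sum_{n\in\mathcal{S}(m)}(\varepsilon_{m,1}-\varepsilon_{n,1})^2\int d\mathbf{r}'\int d\mathbf{r}''\,R\,F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'') \\
&- 4\pi\hbar^2\sum_{n\in\mathcal{C}(m)}(\varepsilon_{m,0}-\varepsilon_{n,0})^{-2}\int d\mathbf{r}'\int d\mathbf{r}''\,\delta(\mathbf{r}'-\mathbf{r}'')\,F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'')
\end{aligned} \tag{128}$$

In the last term of $\varepsilon_{m,2}$ we could of course have carried out one integration;
however the expression as given will turn out to be more convenient in the
development which follows.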

We are now ready to apply the variation principle. In the case of a perturbation expansion, the proper procedure is to apply the variation process successively for each power of $\lambda$. We demand of course that the wave functions are constrained by orthonormality; we expect that our variational solutions can be chosen so that they are eigenfunctions of $\beta$, and that they satisfy the completeness and the charge conservation conditions.

We apply the variation principle to the first equation (128), which arises from $\lambda^0$; we must also honour the corresponding orthonormality constraint, equation (111) for $p = 0$. This is the usual Hamiltonian variational problem, and we of course again obtain equation (107). The solutions $\Psi_{m\alpha,0}$ are taken as direct products of one-electron spinors; $\varepsilon_{m,0}$ is the rest mass of $\Psi_{m\alpha,0}$. If there are identical particles, proper linear combinations of direct products can always be taken so that the $\Psi_{m\alpha,0}$ have appropriate symmetry properties with respect to permutations of particles. We feel confident that the eigenfunctions of $M$ span the entire Hilbert space, so that the completeness condition (112) is satisfied for $p = 0$. In Appendix C it is shown that the charge conservation conditions which depend on $\Psi_{m\alpha,0}$ are also satisfied.

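We now turn to the second equation (128). First we evaluate the interaction term. For the charge and current densities we have the obvious identities

$$\begin{aligned}
[M, \rho_\mu(\mathbf{r}')] &= 0 \\
\mathbf{j}_\mu(\mathbf{r}') &= (2m_\mu)^{-1}[M, \beta_\mu\mathbf{j}_\mu(\mathbf{r}')]
\end{aligned} \tag{129}$$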
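The second identity can be checked for a single particle with explicit Dirac matrices. The sketch below rests on assumptions that are not spelled out in this passage: it uses the standard Dirac representation, takes the rest-mass operator of one particle to be $m\beta$, and takes the spinor part of the current to be $\boldsymbol{\alpha}$, so that the identity reduces to $[m\beta, \beta\alpha_k] = 2m\alpha_k$. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def scale(c, a):
    """Multiply every matrix element by the number c."""
    return [[c * x for x in row] for row in a]


def block_offdiag(s):
    """4x4 matrix [[0, s], [s, 0]] built from a 2x2 matrix s."""
    zero = [0, 0]
    return [zero + s[0], zero + s[1], s[0] + zero, s[1] + zero]


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA = [block_offdiag(s) for s in PAULI]  # Dirac alpha matrices (assumed representation)
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def current_identity_holds(mass):
    """True if [mass * beta, beta * alpha_k] == 2 * mass * alpha_k for k = 1, 2, 3."""
    m_op = scale(mass, BETA)
    return all(commutator(m_op, matmul(BETA, a)) == scale(2 * mass, a)
               for a in ALPHA)


def density_commutes(mass):
    """True if mass * beta commutes with the unit matrix (spinor part of the density)."""
    zero = [[0] * 4 for _ in range(4)]
    return commutator(scale(mass, BETA), IDENTITY) == zero
```

Using an integer mass keeps all matrix elements exact, so the comparison needs no tolerance. The check covers only the spinor algebra; the spatial delta function in the densities commutes with the rest-mass operator and plays no role here.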





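Taking matrix elements with the zero-order wave functions we obtain

$$\begin{aligned}
(\varepsilon_{m,0} - \varepsilon_{n,0})\rho_{\mu,m\alpha,n\beta,0}(\mathbf{r}') &= 0 \\
\mathbf{j}_{\mu,m\alpha,n\beta,0}(\mathbf{r}') &= (2m_\mu)^{-1}(\varepsilon_{m,0} - \varepsilon_{n,0})\langle\Psi_{m\alpha,0}|\beta_\mu\mathbf{j}_\mu(\mathbf{r}')|\Psi_{n\beta,0}\rangle
\end{aligned} \tag{130}$$

We conclude from equations (130) that

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta,0}(\mathbf{r}') &= 0, \qquad n\in\mathcal{G}(m) \\
\mathbf{j}_{\mu,m\alpha,n\beta,0}(\mathbf{r}') &= 0, \qquad n\in\mathcal{F}(m)
\end{aligned} \tag{131}$$

Using these relations we may write

$$\begin{aligned}
\sum_{n\in\mathcal{F}(m)}F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'')
&= \tfrac{1}{2}\sum_{n\in\mathcal{F}(m)}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{\gamma}\rho_{\mu,m\alpha,n\gamma,0}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,0}(\mathbf{r}'') \\
&= \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{n\gamma}\langle\Psi_{m\alpha,0}|\rho_\mu(\mathbf{r}')|\Psi_{n\gamma,0}\rangle\langle\Psi_{n\gamma,0}|\rho_\nu(\mathbf{r}'')|\Psi_{m\beta,0}\rangle \\
&= \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\langle\Psi_{m\alpha,0}|\rho_\mu(\mathbf{r}')\rho_\nu(\mathbf{r}'')|\Psi_{m\beta,0}\rangle \\
&= \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}e_\mu e_\nu\langle\Psi_{m\alpha,0}|\delta(\mathbf{r}' - \mathbf{r}_\mu)\delta(\mathbf{r}'' - \mathbf{r}_\nu)|\Psi_{m\beta,0}\rangle
\end{aligned} \tag{132}$$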

In obtaining the final result (132), we first dropped the current density term on account of the second equation (131); next we extended the summation over states to all states, on account of the first equation (131); next we used the completeness relation, equation (112) for $p = 0$, to carry out the sum over states in closed form; and last, we used the definition of the charge density, equation (19). Using this result in the interaction term in the second equation (128), we can carry out the integrations, and obtain

$$\sum_{n\in\mathcal{F}(m)}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'') = \langle\Psi_{m\alpha,0}|V|\Psi_{m\beta,0}\rangle \tag{133}$$

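where $V$ is the usual Coulomb interaction operator, namely

$$V = \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}e_\mu e_\nu r_{\mu\nu}^{-1} \tag{134}$$

with the common abbreviation

$$r_{\mu\nu} = |\mathbf{r}_{\mu\nu}| = |\mathbf{r}_\mu - \mathbf{r}_\nu| \tag{135}$$
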
If we substitute the Coulomb energy expression (133) into the second equation (128), we note that the same result would have been obtained if the scaled Dirac Hamiltonian had had an additional term $\lambda^2 V$. Hence up to order $\lambda^2$ and the energy $\varepsilon_{m,1}$ we do have a Hamiltonian formulation as a valid description of the electromagnetic interaction. The second equation (128) now becomes explicitly

$$\begin{aligned}
\varepsilon_{m,1}\delta_{\alpha\beta} ={}& \langle\Psi_{m\alpha,2}|M|\Psi_{m\beta,0}\rangle + \langle\Psi_{m\alpha,1}|M|\Psi_{m\beta,1}\rangle + \langle\Psi_{m\alpha,0}|M|\Psi_{m\beta,2}\rangle \\
&+ \langle\Psi_{m\alpha,1}|P|\Psi_{m\beta,0}\rangle + \langle\Psi_{m\alpha,0}|P|\Psi_{m\beta,1}\rangle + \langle\Psi_{m\alpha,0}|V|\Psi_{m\beta,0}\rangle
\end{aligned} \tag{136}$$

We eliminate $\Psi_{m\alpha,2}$ and $\Psi_{m\beta,2}$ by taking the orthonormality condition, equation (111), for $p = 2$, multiplying by $\varepsilon_{m,0}$, and subtracting the result from equation (136). Making use also of the commutator expressions for $P$ and $T$, equations (97) and (98), and of equation (107), we obtain after some manipulation

$$\varepsilon_{m,1}\delta_{\alpha\beta} = \langle\Psi_{m\alpha,1} - K\Psi_{m\alpha,0}|M - \varepsilon_{m,0}|\Psi_{m\beta,1} - K\Psi_{m\beta,0}\rangle + \langle\Psi_{m\alpha,0}|T + V|\Psi_{m\beta,0}\rangle \tag{137}$$


We now apply the variational procedure to equation (137). We demand that $\varepsilon_{m,1}$ be stationary for variations in $\Psi_{m\alpha,0}$ and $\Psi_{m\alpha,1}$, maintaining the relevant orthonormality constraints, equation (111) for $p = 0$ and $p = 1$. As far as the variation is concerned, we specifically do not impose the other known conditions which the wave functions have to satisfy: completeness, charge conservation, $\Psi_{m\alpha,0}$ eigenfunctions of $M$ and $\beta$, $\Psi_{m\alpha,1}$ eigenfunction of $\beta$. It will be seen, however, that the variational solutions obtained permit a choice so that all these other conditions are indeed satisfied. The variations with respect to $\Psi_{m\alpha,1}$ and $\Psi_{m\alpha,0}$ yield, after the usual determination of the Lagrange multipliers,

$$(M - \varepsilon_{m,0})(\Psi_{m\alpha,1} - K\Psi_{m\alpha,0}) = 0 \tag{138}$$

$$(T + V)\Psi_{m\alpha,0} = \varepsilon_{m,1}\Psi_{m\alpha,0} \tag{139}$$

Since $T$ and $V$ commute with both $M$ and $\beta$, equation (139) permits a solution so that $\Psi_{m\alpha,0}$ is a simultaneous eigenfunction of $M$, $\beta$, and $T + V$. We can also say that equation (139) removes a substantial part of the degeneracy inherent in the $\Psi_{m\alpha,0}$ up to this point. For the no-hole solutions, equation (139) is equivalent to the non-relativistic Schrödinger equation for $N$ particles with spin one half.

The general solution of equation (138) is

$$\Psi_{m\alpha,1} = K\Psi_{m\alpha,0} + \Psi'_{m\alpha,0} \tag{140}$$

where $\Psi'_{m\alpha,0}$ is any function of rest mass $\varepsilon_{m,0}$. The requirement of hole parity, however, demands that we restrict the solutions (140) to those where $\Psi_{m\alpha,0}$ and $\Psi'_{m\alpha,0}$ have opposite hole parity. Hence we must have

$$\Psi'_{m\alpha,0} = 0 \tag{141}$$

unless there exists an even subset of the $N$ particles with the same mass as some odd subset. Obviously this is a rather special case of accidental degeneracy. It cannot occur for a system of identical particles; since this is our primary concern, we assume from now on that equation (141) holds, so that

$$\Psi_{m\alpha,1} = K\Psi_{m\alpha,0} \tag{142}$$



Using equation (142), we easily verify the relevant orthonormality and completeness conditions, equations (111) and (112) for $p = 1$. For charge conservation we again refer to Appendix C.

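We now turn to the third equation (128), which arises from the terms of order $\lambda^4$. To evaluate the contribution due to $F_{nm\alpha\beta,1}$ we start from the identities

$$\begin{aligned}
[K, \rho_\mu(\mathbf{r}')] &= (2m_\mu)^{-1}[M, \beta_\mu[K, \rho_\mu(\mathbf{r}')]] \\
[M, [K, \mathbf{j}_\mu(\mathbf{r}')]] &= 0
\end{aligned} \tag{143}$$

The proofs are elementary. We note further that from equations (114) and (142) follows

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= -\langle\Psi_{m\alpha,0}|[K, \rho_\mu(\mathbf{r}')]|\Psi_{n\beta,0}\rangle \\
\mathbf{j}_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= -\langle\Psi_{m\alpha,0}|[K, \mathbf{j}_\mu(\mathbf{r}')]|\Psi_{n\beta,0}\rangle
\end{aligned} \tag{144}$$

Hence taking matrix elements of equations (143) with the zero-order wave functions yields

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= -(2m_\mu)^{-1}(\varepsilon_{m,0} - \varepsilon_{n,0})\langle\Psi_{m\alpha,0}|\beta_\mu[K, \rho_\mu(\mathbf{r}')]|\Psi_{n\beta,0}\rangle \\
(\varepsilon_{m,0} - \varepsilon_{n,0})\mathbf{j}_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= 0
\end{aligned} \tag{145}$$

from which we conclude

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= 0, \qquad n\in\mathcal{F}(m) \\
\mathbf{j}_{\mu,m\alpha,n\beta,1}(\mathbf{r}') &= 0, \qquad n\in\mathcal{G}(m)
\end{aligned} \tag{146}$$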


Figure 1 Transformation of triple sum 


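With the help of equations (112), (114), (121), (131), (142) and (146) we proceed to simplify $F_{nm\alpha\beta,1}$, using techniques similar to those used to derive equation (132). We obtain

$$\begin{aligned}
\sum_{n\in\mathcal{F}(m)}F_{nm\alpha\beta,1}(\mathbf{r}',\mathbf{r}'')
={}& \tfrac{1}{2}\sum_{n\in\mathcal{F}(m)}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{\gamma}\bigl[\rho_{\mu,m\alpha,n\gamma,2}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,0}(\mathbf{r}'') \\
&\quad + \rho_{\mu,m\alpha,n\gamma,0}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,2}(\mathbf{r}'') - \mathbf{j}_{\mu,m\alpha,n\gamma,1}(\mathbf{r}')\cdot\mathbf{j}_{\nu,n\gamma,m\beta,1}(\mathbf{r}'')\bigr] \\
={}& \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{n\gamma}\Bigl[\sum_{p=0}^{2}\rho_{\mu,m\alpha,n\gamma,2-p}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,p}(\mathbf{r}'') \\
&\quad - \rho_{\mu,m\alpha,n\gamma,1}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,1}(\mathbf{r}'') - \mathbf{j}_{\mu,m\alpha,n\gamma,1}(\mathbf{r}')\cdot\mathbf{j}_{\nu,n\gamma,m\beta,1}(\mathbf{r}'')\Bigr] \\
={}& \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{n\gamma}\Bigl[\sum_{p=0}^{2}\sum_{q=0}^{2-p}\sum_{r=0}^{p}\langle\Psi_{m\alpha,2-p-q}|\rho_\mu(\mathbf{r}')|\Psi_{n\gamma,q}\rangle\langle\Psi_{n\gamma,r}|\rho_\nu(\mathbf{r}'')|\Psi_{m\beta,p-r}\rangle \\
&\quad - \langle\Psi_{m\alpha,0}|[K, \rho_\mu(\mathbf{r}')]|\Psi_{n\gamma,0}\rangle\langle\Psi_{n\gamma,0}|[K, \rho_\nu(\mathbf{r}'')]|\Psi_{m\beta,0}\rangle \\
&\quad - \langle\Psi_{m\alpha,0}|[K, \mathbf{j}_\mu(\mathbf{r}')]|\Psi_{n\gamma,0}\rangle\cdot\langle\Psi_{n\gamma,0}|[K, \mathbf{j}_\nu(\mathbf{r}'')]|\Psi_{m\beta,0}\rangle\Bigr] \\
={}& \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{n\gamma}\Bigl[\sum_{t=0}^{2}\sum_{s=0}^{t}\sum_{r=0}^{s}\langle\Psi_{m\alpha,t-s}|\rho_\mu(\mathbf{r}')|\Psi_{n\gamma,s-r}\rangle\langle\Psi_{n\gamma,r}|\rho_\nu(\mathbf{r}'')|\Psi_{m\beta,2-t}\rangle \\
&\quad - \langle\Psi_{m\alpha,0}|[K, \rho_\mu(\mathbf{r}')]|\Psi_{n\gamma,0}\rangle\langle\Psi_{n\gamma,0}|[K, \rho_\nu(\mathbf{r}'')]|\Psi_{m\beta,0}\rangle \\
&\quad - \langle\Psi_{m\alpha,0}|[K, \mathbf{j}_\mu(\mathbf{r}')]|\Psi_{n\gamma,0}\rangle\cdot\langle\Psi_{n\gamma,0}|[K, \mathbf{j}_\nu(\mathbf{r}'')]|\Psi_{m\beta,0}\rangle\Bigr] \\
={}& \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}\Bigl[\sum_{t=0}^{2}\langle\Psi_{m\alpha,t}|\rho_\mu(\mathbf{r}')\rho_\nu(\mathbf{r}'')|\Psi_{m\beta,2-t}\rangle \\
&\quad - \langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, \rho_\mu(\mathbf{r}')\rho_\nu(\mathbf{r}'') + \mathbf{j}_\mu(\mathbf{r}')\cdot\mathbf{j}_\nu(\mathbf{r}'')]]|\Psi_{m\beta,0}\rangle\Bigr]
\end{aligned} \tag{147}$$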
In deriving this result we converted the triple sum over $p$, $q$, $r$ by the substitution

$$\begin{aligned}
p &= 2 + r - t \\
q &= s - r
\end{aligned} \tag{148}$$

the corresponding transformation of the summation limits is

$$\sum_{p=0}^{2}\sum_{q=0}^{2-p}\sum_{r=0}^{p} = \sum_{t=0}^{2}\sum_{s=0}^{t}\sum_{r=0}^{s} \tag{149}$$

as illustrated in Figure 1.
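The change of summation variables can be verified by brute-force enumeration. The following minimal sketch, with hypothetical function names, lists every triple $(p, q, r)$ allowed by the original limits $0 \le p \le 2$, $0 \le q \le 2 - p$, $0 \le r \le p$, and every triple obtained from $0 \le r \le s \le t \le 2$ through $p = 2 + r - t$, $q = s - r$; the two lists should contain exactly the same ten triples.

```python
def original_triples():
    """All (p, q, r) with 0 <= p <= 2, 0 <= q <= 2 - p, 0 <= r <= p."""
    return [(p, q, r)
            for p in range(3)
            for q in range(3 - p)
            for r in range(p + 1)]


def transformed_triples():
    """All (p, q, r) reached from 0 <= r <= s <= t <= 2 via p = 2 + r - t, q = s - r."""
    return [(2 + r - t, s - r, r)
            for t in range(3)
            for s in range(t + 1)
            for r in range(s + 1)]


def wave_function_orders(t, s, r):
    """Orders of the four wave functions in one term, written in the new variables."""
    return (t - s, s - r, r, 2 - t)
```

In the new variables the two inner wave functions carry the orders $s - r$ and $r$, whose sum $s$ is a single summation index, so the sum over intermediate states can be closed with the completeness relation for each order separately.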


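Integration of (147) over $\mathbf{r}'$ and $\mathbf{r}''$ with the factor $R^{-1}$ readily yields for the first interaction term of $\varepsilon_{m,2}$

$$\begin{aligned}
\sum_{n\in\mathcal{F}(m)}\int d\mathbf{r}'\int d\mathbf{r}''\,R^{-1}F_{nm\alpha\beta,1}(\mathbf{r}',\mathbf{r}'')
={}& \sum_{t=0}^{2}\langle\Psi_{m\alpha,t}|V|\Psi_{m\beta,2-t}\rangle \\
&- \tfrac{1}{2}\sum_{\mu}\sum_{\nu\neq\mu}e_\mu e_\nu\langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, r_{\mu\nu}^{-1}(1 + \boldsymbol{\alpha}_\mu\cdot\boldsymbol{\alpha}_\nu)]]|\Psi_{m\beta,0}\rangle
\end{aligned} \tag{150}$$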


In order to evaluate the second interaction term of $\varepsilon_{m,2}$ we need to absorb the factor $(\varepsilon_{m,1} - \varepsilon_{n,1})^2$ into the matrix elements. Such a relation is conveniently provided by the equation of continuity, namely the first equation (115) for $p = 1$, $n\in\mathcal{F}(m)$, yielding

$$(\varepsilon_{m,1} - \varepsilon_{n,1})\rho_{\mu,m\alpha,n\beta,0}(\mathbf{r}') = i\hbar\nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta,1}(\mathbf{r}'), \qquad n\in\mathcal{F}(m) \tag{151}$$

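With the help of this we obtain

$$\begin{aligned}
-\tfrac{1}{2}\hbar^{-2}&\sum_{n\in\mathcal{F}(m)}(\varepsilon_{m,1} - \varepsilon_{n,1})^{2}F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'') \\
&= -\tfrac{1}{4}\hbar^{-2}\sum_{n\in\mathcal{F}(m)}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{\gamma}(\varepsilon_{m,1} - \varepsilon_{n,1})^{2}\rho_{\mu,m\alpha,n\gamma,0}(\mathbf{r}')\rho_{\nu,n\gamma,m\beta,0}(\mathbf{r}'') \\
&= -\tfrac{1}{4}\sum_{n\in\mathcal{F}(m)}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{\gamma}[\nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\gamma,1}(\mathbf{r}')][\nabla''\cdot\mathbf{j}_{\nu,n\gamma,m\beta,1}(\mathbf{r}'')] \\
&= -\tfrac{1}{4}\sum_{\mu}\sum_{\nu\neq\mu}\sum_{n\gamma}\langle\Psi_{m\alpha,0}|[K, \nabla'\cdot\mathbf{j}_\mu(\mathbf{r}')]|\Psi_{n\gamma,0}\rangle\langle\Psi_{n\gamma,0}|[K, \nabla''\cdot\mathbf{j}_\nu(\mathbf{r}'')]|\Psi_{m\beta,0}\rangle \\
&= -\tfrac{1}{4}\sum_{\mu}\sum_{\nu\neq\mu}\langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, [\nabla'\cdot\mathbf{j}_\mu(\mathbf{r}')][\nabla''\cdot\mathbf{j}_\nu(\mathbf{r}'')]]]|\Psi_{m\beta,0}\rangle
\end{aligned} \tag{152}$$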

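Integrating (152) over $\mathbf{r}'$ and $\mathbf{r}''$ with the factor $R$, we use Gauss's theorem twice, for $\nabla'$ and $\nabla''$. We then get for the second interaction term of $\varepsilon_{m,2}$

$$\begin{aligned}
-\tfrac{1}{2}\hbar^{-2}&\sum_{n\in\mathcal{F}(m)}(\varepsilon_{m,1} - \varepsilon_{n,1})^{2}\int d\mathbf{r}'\int d\mathbf{r}''\,R\,F_{nm\alpha\beta,0}(\mathbf{r}',\mathbf{r}'') \\
&= -\tfrac{1}{4}\sum_{\mu}\sum_{\nu\neq\mu}\int d\mathbf{r}'\int d\mathbf{r}''\,\langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, [\mathbf{j}_\mu(\mathbf{r}')\cdot\nabla'][\mathbf{j}_\nu(\mathbf{r}'')\cdot\nabla'']R]]|\Psi_{m\beta,0}\rangle \\
&= -\tfrac{1}{4}\sum_{\mu}\sum_{\nu\neq\mu}e_\mu e_\nu\langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, (\boldsymbol{\alpha}_\mu\cdot\nabla_\mu)(\boldsymbol{\alpha}_\nu\cdot\nabla_\nu)r_{\mu\nu}]]|\Psi_{m\beta,0}\rangle \\
&= \tfrac{1}{4}\sum_{\mu}\sum_{\nu\neq\mu}e_\mu e_\nu\langle\Psi_{m\alpha,0}|[K_\mu, [K_\nu, r_{\mu\nu}^{-1}\boldsymbol{\alpha}_\mu\cdot\boldsymbol{\alpha}_\nu - r_{\mu\nu}^{-3}(\boldsymbol{\alpha}_\mu\cdot\mathbf{r}_{\mu\nu})(\boldsymbol{\alpha}_\nu\cdot\mathbf{r}_{\mu\nu})]]|\Psi_{m\beta,0}\rangle
\end{aligned} \tag{153}$$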
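The last step of (153) rests on a purely geometric identity for the mixed second derivative of the interparticle distance. It can be checked numerically with ordinary vectors standing in for the matrix vectors $\boldsymbol{\alpha}_\mu$ and $\boldsymbol{\alpha}_\nu$. The sketch below is an illustration under that simplifying assumption, with hypothetical function names and a central-difference step chosen for convenience; it compares a finite-difference estimate of $(\mathbf{a}\cdot\nabla_\mu)(\mathbf{b}\cdot\nabla_\nu)r_{\mu\nu}$ with $-[r_{\mu\nu}^{-1}\mathbf{a}\cdot\mathbf{b} - r_{\mu\nu}^{-3}(\mathbf{a}\cdot\mathbf{r}_{\mu\nu})(\mathbf{b}\cdot\mathbf{r}_{\mu\nu})]$.

```python
import math


def dot(u, v):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def shifted(r, step, direction):
    """Return r + step * direction."""
    return tuple(x + step * d for x, d in zip(r, direction))


def mixed_derivative(a, b, r_mu, r_nu, h=1e-3):
    """Central-difference estimate of (a . grad_mu)(b . grad_nu) |r_mu - r_nu|."""
    f = math.dist
    return (f(shifted(r_mu, h, a), shifted(r_nu, h, b))
            - f(shifted(r_mu, h, a), shifted(r_nu, -h, b))
            - f(shifted(r_mu, -h, a), shifted(r_nu, h, b))
            + f(shifted(r_mu, -h, a), shifted(r_nu, -h, b))) / (4 * h * h)


def kernel_153(a, b, r_mu, r_nu):
    """a.b / r - (a.x)(b.x) / r**3 with x = r_mu - r_nu and r = |x|."""
    x = tuple(p - q for p, q in zip(r_mu, r_nu))
    r = math.sqrt(dot(x, x))
    return dot(a, b) / r - dot(a, x) * dot(b, x) / r ** 3
```

For any separated pair of points, `mixed_derivative` should agree with `-kernel_153` up to the discretization error of the finite difference, which accounts for the change of overall sign between the last two lines of (153).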


The third interaction term is evaluated in a similar manner: